1. Introduction


Objectives  for  an  international  CTBT  (Comprehensive  Test  Ban  Treaty)  monitoring 
system,  stated  in  the  United  States  Working  Paper  for  the  Committee  on  Disarmament 
(May  18,  1994),  are  detection  and  identification  of  nuclear  explosions  down  to  a  few 
kilotons  or  less.  Difficulties  associated  with  seismic  event  identification  down  to  this  level 
include  the  fact  that  small  events  may  be  seen  at  only  a  single  station  and,  annually,  there 
are  many  thousands  of  competing  earthquakes,  mining  blasts  and  other  mining-induced 
events  worldwide.  Thus,  robust  automated  methods  are  needed  to  make  use  of  limited 
information  and  to  reduce  the  workload  of  human  analysts  to  a  manageable  level. 

Furthermore,  observed  seismic  signals  from  small  sources  are  generally  quite  complicated 
and  exhibit  dramatic  dependence  on  regional  geophysics;  even  the  most  promising 
discriminants can vary widely in different regions. Hence, region-specific information
regarding  seismic  discriminants  is  vital  in  distinguishing  nuclear  explosions  from  other 
events.  Unfortunately,  relevant  ground-truth  data,  particularly  for  small  underground 
nuclear  explosions,  do  not  exist  for  most  regions,  including  many  of  current  interest.  In 
addition,  it  has  yet  to  be  shown  that  discrimination  rules,  established  in  a  given  region  for 
which  data  exist,  can  be  transported  effectively  to  a  new  region.  Thus,  in  most  cases, 
seismic  event  identification,  within  the  context  of  CTBT  or  NPT  (Non-Proliferation 
Treaty)  monitoring,  is  a  problem  of  detecting  unusual  events,  i.e.,  outliers. 

In addition, precise statistical metrics are needed to quantify identification performance to
meet the specific needs of policy use. In this regard, good control of one of the error
rates is needed so that the false alarm rate, for example, is not overwhelming and, if there
is an alert, the probability that an error was made is known precisely. It may further be
desirable to rank events in order to focus on the most “suspicious” ones.

Thus  this  effort  has  concentrated  on  developing  and  applying  statistical  methods  to  perform 
seismic  event  identification  and  on  quantifying  identification  capabilities  with  regard  to 
seismic  monitoring  of  a  CTBT  or  NPT.  We  have  been  developing  and  applying  a  statistical 
framework  for  seismic  event  identification  with  these  and  other  considerations  in  mind. 

The  fundamental  methods  consist  of  statistical  tests  for  outlier  detection  and  classification, 
as  well  as  preliminary  statistical  analyses  to  test  appropriate  assumptions  regarding  the  data 
in  order  to  optimize  the  results  and  ensure  self-consistency.  Our  methodology  provides  a 
statistical  framework  for  seismic  event  identification  that:  (1)  accurately  treats  statistical 
fluctuations of all discriminants used; (2) can identify nuclear tests in regions for which
relevant  ground-truth  training  data  may  or  may  not  exist;  (3)  provides  flexibility  to 
incorporate  any  discriminant  and  assess  its  utility  objectively;  (4)  can  function  in  an 
automated  mode  to  flag  and/or  rank  suspicious  events;  (5)  can  control  the  false  alarm  rate; 
(6) precisely quantifies identification performance; and (7) provides a rigorous and
defensible  framework  with  which  to  report  and  justify  the  results.  To  also  serve  as  an 
interactive  analyst  and  research  tool,  we  have  been  developing  an  X  Window  graphical  user 
interface,  featuring  a  wide  range  of  graphical  displays  and  exploratory  data  analysis  tools. 

Our  recent  work  has  focused  on  assessing  regional  event  identification  capabilities  using  a 
method  we  developed  for  outlier  detection.  This  is  motivated  by  the  fact  that  historical 
underground  nuclear  tests  have  been  performed  in  only  a  few  regions.  Our  procedure  may 
be  fully  automated  to  flag  events  warranting  special  attention  and  to  test  all  appropriate 
assumptions  to  ensure  validity  of  the  results.  In  addition,  the  method  allows  straightforward 
control  of  false  alarm  rates  and  a  natural  way  to  rank  events.  We  have  applied  this  approach 
to  seismic  events  in  diverse  geological  regions,  recorded  by  the  ARCESS  and  GERESS 
arrays  in  Norway  and  Germany,  CDSN  (Chinese  Digital  Seismic  Network)  station  WMQ 
in  China,  and  LNN  (Livermore  NTS  Network)  stations  KNB  and  MNV  in  the  western  U.S. 
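
To illustrate how a fixed false alarm rate and an event ranking can both follow from a single test statistic, the sketch below scores new events against an earthquake-only training set. It is a simplified stand-in and not the outlier test described in Section 3: it assumes a single discriminant (a log amplitude ratio), approximately normal training values, and a one-sided normal threshold. The function names and numerical values are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def outlier_scores(training, candidates):
    """Standardize candidate values against the training population."""
    mu, sigma = mean(training), stdev(training)
    return [(value - mu) / sigma for value in candidates]

def flag_and_rank(training, candidates, false_alarm_rate=0.01):
    """Rank candidates by decreasing score and flag those above the
    one-sided normal threshold for the chosen false alarm rate."""
    threshold = NormalDist().inv_cdf(1.0 - false_alarm_rate)
    scores = outlier_scores(training, candidates)
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [(index, score, score > threshold) for index, score in ranked]

# Hypothetical log10(Pn/Lg) values: earthquakes for training, three new events.
earthquakes = [-0.42, -0.31, -0.55, -0.38, -0.47, -0.29, -0.51, -0.40]
new_events = [-0.36, 0.35, -0.50]
results = flag_and_rank(earthquakes, new_events, false_alarm_rate=0.01)
for index, score, flagged in results:
    print(index, round(score, 2), flagged)
```

With training sets as small as those available in practice, a normal threshold only approximates the nominal false alarm rate; a threshold that accounts for the uncertainty in the estimated mean and variance would be more appropriate.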

We  have  also  examined  effects  of  contaminated  training  data  by  intentionally  including 
quarry  blasts  or  rock  bursts  in  earthquake  training  sets  to  determine  if  the  outlier  test  can 
detect  them  and  to  assess  potential  impacts  on  monitoring  performance.  We  also  used  our 
classification  test  to  identify  these  events  to  illustrate  how  it  can  be  used  to  improve 
identification performance for cases in which more than one training set exists. Last, we
repeated  our  identification  analysis  of  the  31  December  1992  Novaya  Zemlya  event,  after 
first  applying  distance  corrections. 

In  Section  2,  we  describe  ground-truth  data  sets  from  ARCESS,  GERESS,  WMQ,  KNB 
and  MNV,  which  we  use  as  a  testbed  to  assess  regional  event  identification  performance. 
We  describe  which  discriminants  perform  robustly  in  the  various  regions  and  provide  a 
comparison  of  Pn/Lg  among  the  regions.  In  Section  3,  we  describe  the  procedures  for 
outlier  detection  and  for  event  classification.  The  latter  approach  applies  to  cases  for  which 
training  data  for  more  than  one  event  type  exist  for  a  given  region.  In  Section  4,  we  present 
results  of  monitoring  applications.  In  Section  5,  we  discuss  conclusions  regarding  the 
current  status  of  regional  identification  performance  using  our  statistical  framework  and 
provide  recommendations  for  future  research  and  development  efforts. 

2. Seismic Data Sets and Regional Discriminants

For  this  study  we  used  feature  (i.e.,  discriminant)  data  provided  by  Baumgardt  (1993a)  for 
seismic  events  recorded  by  the  ARCESS  and  GERESS  regional  arrays  and  CDSN  (Chinese 
Digital  Seismic  Network)  station  WMQ.  ARCESS  events  include  24  earthquakes  (EQs) 
near  Steigen,  Norway,  5  earthquakes  near  Spitsbergen,  53  quarry  blasts  (QBs)  on  the  Kola 
Peninsula,  39  quarry  blasts  in  the  Kiruna  region  of  Sweden,  and  3  underground  nuclear 
explosions  (EXs)  at  the  Novaya  Zemlya  test  site.  GERESS  events  include  10  earthquakes 
and  13  quarry  blasts  in  the  Vogtland  region  of  western  Bohemia  (straddling  Germany  and 
the  Czech  Republic)  and  30  rock  bursts  (RBs)  in  the  Lubin  region  of  Poland.  (The  Steigen, 
Vogtland  and  Lubin  data  sets  are  included  in  the  CSS  Ground-Truth  Database  and  are 
discussed  in  more  detail  by  Grant  et  al.,  1993.)  WMQ  events  include  23  earthquakes  in 
China  and  nearby  countries,  16  nuclear  explosions  in  Kazakhstan,  and  1  nuclear  explosion 
at the Lop Nor test site in China. In addition, we used feature data provided by Patton
and  Walter  (1994)  for  76  earthquakes  and  141  nuclear  explosions  at  NTS  (Nevada  Test 
Site),  recorded  by  LNN  (Livermore  NTS  Network)  stations  KNB  and  MNV.  Figure  1 
depicts  locations  of  the  seismic  stations,  arrays  and  most  of  the  events  considered  in  this 
study.  Table  1  summarizes  the  events,  their  epicentral  distances  and  magnitudes. 


Figure  1.  Locations  of  the  ARCESS  and  GERESS  arrays,  CDSN  station  WMQ,  LNN  stations  KNB  and 
MNV, and most of the seismic events used in our regional event identification study.

A significant aspect of this combined data set is its diversity. First, the geological regions
represented here are very diverse. In fact, the geologies of the western U.S. and Scandinavia
differ dramatically. Second, epicentral distances range from 100
to  1300  km,  covering  nearly  the  full  range  of  what  are  considered  regional  distances.  Third, 
the  events  have  a  broad  range  of  magnitudes  from  1.0  to  6.1.  Fourth,  there  are  different 
types  of  events  which  were  nearly  collocated,  such  as  the  events  recorded  by  GERESS  and, 
to  some  extent,  those  recorded  by  KNB  and  MNV.  There  are  also  stations,  such  as  WMQ, 
for  which  events  were  observed  for  a  full  range  of  distances  and  azimuths.  Last,  the  types 
and  configurations  of  seismic  instrumentation  differ.  ARCESS  and  GERESS  are  regional 
arrays  (Mykkeltveit  et  al.,  1990;  Harjes,  1990),  while  WMQ,  MNV  and  KNB  are  each 
single  stations.  Thus,  the  combined  data  set  provides  a  broad  characteristic  sample  of  the 
types  of  regional  events  that  will  be  encountered  when  monitoring  a  CTBT. 


Table  1:  Summary  of  seismic  data  sets  used  in  our  study. 


Array/station   Events              Distance (km)   Magnitude

ARCESS          24 Steigen EQs      385-480         1.0-3.2
                5 Spitsbergen EQs   795-1320        1.5-2.9
                53 Kola QBs         300-430         2.0-3.2
                39 Kiruna QBs       250-295         1.6-2.0
                3 NZ EXs            1100            >3.9

GERESS          10 Vogtland EQs     140-260         1.4-3.2
                13 Vogtland QBs     165-210         2.0-2.6
                30 Lubin RBs        340-350         1.8-3.3

WMQ             23 EQs              100-1100        4.2-5.9
                16 Kazakh EXs       950             4.8-6.1
                1 Lop Nor EX        240             4.7

KNB             59 NTS EQs          295-310         2.1-5.9
                89 NTS EXs          280-315         2.4-5.5

MNV             37 NTS EQs          250-260         2.2-5.9
                78 NTS EXs          190-245         2.6-5.5
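
As a cross-check on the distance and magnitude ranges quoted in the text, Table 1 can be transcribed into a small data structure. The layout and names below are illustrative only; single-valued distances are stored as equal bounds, and the ">3.9" magnitude entry for the Novaya Zemlya explosions is stored as a lower bound with no upper bound.

```python
# (station, event group, count, (min_km, max_km), (min_mag, max_mag))
TABLE_1 = [
    ("ARCESS", "Steigen EQs", 24, (385, 480), (1.0, 3.2)),
    ("ARCESS", "Spitsbergen EQs", 5, (795, 1320), (1.5, 2.9)),
    ("ARCESS", "Kola QBs", 53, (300, 430), (2.0, 3.2)),
    ("ARCESS", "Kiruna QBs", 39, (250, 295), (1.6, 2.0)),
    ("ARCESS", "NZ EXs", 3, (1100, 1100), (3.9, None)),
    ("GERESS", "Vogtland EQs", 10, (140, 260), (1.4, 3.2)),
    ("GERESS", "Vogtland QBs", 13, (165, 210), (2.0, 2.6)),
    ("GERESS", "Lubin RBs", 30, (340, 350), (1.8, 3.3)),
    ("WMQ", "EQs", 23, (100, 1100), (4.2, 5.9)),
    ("WMQ", "Kazakh EXs", 16, (950, 950), (4.8, 6.1)),
    ("WMQ", "Lop Nor EX", 1, (240, 240), (4.7, 4.7)),
    ("KNB", "NTS EQs", 59, (295, 310), (2.1, 5.9)),
    ("KNB", "NTS EXs", 89, (280, 315), (2.4, 5.5)),
    ("MNV", "NTS EQs", 37, (250, 260), (2.2, 5.9)),
    ("MNV", "NTS EXs", 78, (190, 245), (2.6, 5.5)),
]

def overall_range(rows, column):
    """Smallest lower bound and largest known upper bound in a column."""
    lows = [row[column][0] for row in rows]
    highs = [row[column][1] for row in rows if row[column][1] is not None]
    return min(lows), max(highs)

distance_range = overall_range(TABLE_1, 3)    # (100, 1320)
magnitude_range = overall_range(TABLE_1, 4)   # (1.0, 6.1)

# Number of events with measurements listed per array or station.
events_per_station = {}
for station, _group, count, _dist, _mag in TABLE_1:
    events_per_station[station] = events_per_station.get(station, 0) + count
```

The resulting ranges, 100 to 1320 km and magnitudes 1.0 to 6.1, agree with the rounded values quoted in the text.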

Discriminants  considered  in  this  study  include  Pn/Lg  and  Pn/Sn  in  3-5,  4-6,  5-7,  6-8  Hz 
bands,  and  an  Lg  spectral  ratio.  Of  these,  only  Pn/Lg  in  the  6-8  Hz  band  and  Lg  spectral 
ratio  measurements  were  provided  by  LLNL  for  KNB  and  MNV.  Also,  Sn  was  measured 
for  only  2  of  17  explosions  recorded  by  WMQ.  In  the  remainder  of  this  section,  we  describe 
the  discriminant  data  in  more  detail  for  each  of  the  data  sets  at  the  various  stations  or  arrays. 
Baumgardt et al. (1992) describe how these discriminants are computed from seismic
waveforms  for  the  ARCESS,  GERESS  and  WMQ  data  sets,  while  Walter  et  al.  (1994) 
provide  similar  descriptions  for  the  KNB  and  MNV  data  sets. 

2.1.  WMQ  Data  Set 

Figure  2  shows  locations  of  CDSN  station  WMQ,  16  nuclear  explosions  at  the  Balapan  test 
site  in  Kazakhstan,  the  29  September  1988  nuclear  explosion  at  the  Lop  Nor  test  site  in 
China,  and  23  earthquakes  scattered  mostly  around  the  northwest  portion  of  China  and 
western  Mongolia.  For  WMQ,  Pn/Lg  and  Pg/Lg  measurements  were  made  with  the  ISEIS 
system  (Baumgardt  and  Der,  1994)  on  both  the  midband  (bz)  and  high  frequency  (sz) 
vertical-component  channels,  with  sampling  rates  of  20  and  40  Hz,  respectively.  In  this 
study  we  focused  on  measurements  of  waveforms  on  the  sz  channel. 


Figure  2.  Locations  of  CDSN  station  WMQ,  16  nuclear  explosions  in  Kazakhstan,  the  880929  nuclear 
explosion  at  Lop  Nor  and  23  earthquakes  in  China  and  nearby  countries. 

Figure  3  shows  Pn/Lg  and  Pg/Lg  values  (before  applying  distance  corrections)  for  these 
events,  based  on  maximum  amplitude  measurements  in  nine  frequency  bands  ranging  from 
0.5  to  16  Hz.  The  mid-range  frequency  bands  (3-8  Hz)  provide  the  best  separation  of 
earthquakes  and  explosions.  Since  Lg  attenuates  more  rapidly  with  distance  than  Pn  and  Pg 
in  this  region  and  the  Lop  Nor  explosion  is  only  200  km  from  WMQ,  it  has  the  smallest  Pn/ 
Lg  and  Pg/Lg  values  of  all  the  explosions  and  similar  values  in  high  frequency  bands  (6-16 
Hz)  to  earthquakes  at  much  greater  distances. 
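
The phase ratios discussed here are based on maximum amplitude measurements. As a rough illustration only (the actual processing is described by Baumgardt et al., 1992, and is not reproduced here), the sketch below takes a band-filtered trace as time and amplitude samples and forms the ratio of peak absolute amplitudes in two phase windows. The window times, sample values and function names are hypothetical.

```python
def max_amplitude(times, samples, window):
    """Peak absolute amplitude of the samples falling inside a time window."""
    start, end = window
    values = [abs(s) for t, s in zip(times, samples) if start <= t <= end]
    if not values:
        raise ValueError("no samples inside window")
    return max(values)

def phase_ratio(times, samples, numerator_window, denominator_window):
    """Ratio of peak amplitudes in two phase windows, e.g. Pn over Lg."""
    return (max_amplitude(times, samples, numerator_window)
            / max_amplitude(times, samples, denominator_window))

# Hypothetical band-filtered trace sampled once per second.
times = [float(t) for t in range(10)]
trace = [0.1, -0.8, 0.5, 0.2, -0.1, 0.3, -2.0, 1.6, 0.4, -0.2]

# Hypothetical Pn window (1-3 s) and Lg window (6-8 s).
pn_lg = phase_ratio(times, trace, (1.0, 3.0), (6.0, 8.0))
```

In this toy trace the Pn window peaks at 0.8 and the Lg window at 2.0, giving a ratio of 0.4.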


[Figure 3 panels: "China: WMQ Pn/Lg(MAX)" (left) and "China: WMQ Pg/Lg(MAX)" (right). Markers: X Explosion, □ Earthquake.]

Figure  3.  Scatter  plots  of  Pn/Lg  (left)  and  Pg/Lg  (right)  in  nine  frequency  bands,  before  applying 
distance  corrections,  for  17  nuclear  explosions  and  23  earthquakes  recorded  by  station  WMQ. 


Figure  4  (left)  shows  the  distance  dependence  of  uncorrected  Pn/Lg  values  in  the  6-8  Hz 
band.  Similar  dependence  is  exhibited  in  the  other  frequency  bands,  as  well  as  for  Pg/Lg, 
although  there  are  frequency  and  phase  dependent  variations.  Note  that  the  Lop  Nor 
explosion  at  200  km  has  the  same  uncorrected  Pn/Lg(6-8  Hz)  value  as  the  earthquake  with 
a  distance  to  WMQ  of  roughly  900  km.  Earthquakes  with  epicentral  distances  less  than  300 
km  exhibit  considerable  scatter  in  their  Pn/Lg  and  Pg/Lg  values. 


[Figure 4 panels: "WMQ: Uncorrected Pn/Lg(6-8 Hz)" (left) and "WMQ: Corrected Pn/Lg(6-8 Hz)" (right).]


Figure  4.  Plots  of  uncorrected  (left)  and  distance-corrected  (right)  Pn/Lg  values  in  the  6-8  Hz  band 
versus  epicentral  distance  for  WMQ  events. 

To remove this dependence, we derived and applied distance corrections to the amplitude
ratios. To do this, we performed a least squares fit, using the earthquake data, and assuming
a standard form of the relation of Pn/Lg to distance (e.g., Sereno, 1990):

$$\mathrm{Pn/Lg}(f)_{\mathrm{corrected}} = \left(\Delta/\Delta_0\right)^{a(f)} \, \mathrm{Pn/Lg}(f)_{\mathrm{uncorrected}},$$

where $\Delta$ is the epicentral distance of the event, $\Delta_0$ is a
reference distance (taken to be 200 km), and $a(f)$ is a frequency dependent coefficient
which is solved for in the least squares procedure. The problem is linearized with respect
to $a(f)$ by taking the logarithm of both sides of this expression. Figure 4 (right) shows the
distance-corrected Pn/Lg(6-8 Hz) values. The distance-corrected Pn/Lg and Pg/Lg values
for the Lop Nor explosion are all now well within those for the Balapan nuclear explosions.
There is still considerable scatter in Pn/Lg values for earthquakes at close distances. More
events at these distances are desired to verify or improve the derived distance corrections.
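
The log-linear fit can be sketched as follows. This is a minimal illustration under one stated assumption that the text does not spell out: that the corrected earthquake ratios scatter about a constant level c, so that log10 of the uncorrected ratio equals c minus a times log10 of the distance divided by the reference distance. The function names and synthetic values are hypothetical and are not the report's data.

```python
import math

def fit_distance_exponent(distances_km, ratios, ref_km=200.0):
    """Least squares fit of log10(R_unc) = c - a * log10(D / D0).

    Returns (a, c): the exponent used in the correction and the fitted
    log10 level of the corrected ratios.
    """
    x = [math.log10(d / ref_km) for d in distances_km]
    y = [math.log10(r) for r in ratios]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return -slope, mean_y - slope * mean_x

def correct_ratio(ratio, distance_km, a, ref_km=200.0):
    """Apply R_corr = (D / D0) ** a * R_unc."""
    return (distance_km / ref_km) ** a * ratio

# Synthetic earthquake ratios that grow with distance (true a = -1.2).
distances = [150.0, 300.0, 500.0, 800.0, 1000.0]
uncorrected = [0.05 * (d / 200.0) ** 1.2 for d in distances]

a_fit, c_fit = fit_distance_exponent(distances, uncorrected)
corrected = [correct_ratio(r, d, a_fit) for r, d in zip(uncorrected, distances)]
```

For these noise-free synthetic values the fit recovers the exponent used to generate them, and every corrected ratio returns to the common level of 0.05. With real data the scatter about the fitted line indicates how well a single power law describes the distance dependence.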


Figure  5  shows  scatter  plots  of  the  Pn/Lg  and  Pg/Lg  values  for  the  WMQ  events  after 
applying  the  derived  distance  corrections.  Pn/Lg  measurements  in  3-5, 4-6,  5-7  and  6-8  Hz 
bands  provide  the  best  discrimination  between  explosions  and  earthquakes  at  WMQ.  Pg/Lg 
measurements  in  3-5,  4-6  and  5-7  Hz  bands  also  separate  explosions  and  earthquakes  at 
WMQ.  We  did  not,  however,  use  Pg/Lg  in  the  study  presented  below  to  assess  regional 
identification  performance  since  Pg/Lg  does  not  discriminate  as  consistently  in  various 
regions  as  Pn/Lg  and  Pn/Sn.  Also,  Sn  measurements  were  available  for  only  2  of  the  17 
explosions  recorded  by  WMQ.  Thus,  we  could  not  use  Pn/Sn  as  a  discriminant  at  WMQ. 

Figure 5. Scatter plots of Pn/Lg (left) and Pg/Lg (right) in nine frequency bands, after applying distance
corrections,  for  17  nuclear  explosions  and  23  earthquakes  recorded  by  station  WMQ. 

Figure 6 shows values of Lg spectral ratios for the WMQ events based on maximum (left)
and  rms  (right)  measurements  of  Lg  spectra  in  three  combinations  of  frequency  bands.  The 
plots  show  that  Lg  spectral  ratios  do  not  provide  effective  discrimination  in  this  region. 

Figure 6. Scatter plots of Lg spectral ratios for nuclear explosions and earthquakes recorded by WMQ,
based  on  maximum  (left)  and  rms  (right)  measurements  of  Lg  spectra  in  three  combinations  of 
frequency  bands  on  the  sz  channel. 


2.2.  ARCESS  Data  Set 

ARCESS  events  used  in  this  study  include  24  earthquakes  in  the  Steigen  region  of  northern 
Norway,  5  earthquakes  near  Spitsbergen,  53  quarry  blasts  on  the  Kola  Peninsula,  39  quarry 
blasts in the Kiruna region of Sweden, and 3 underground nuclear explosions at the Novaya
Zemlya  test  site.  Figure  7  shows  the  locations  of  these  events  and  the  ARCESS  array.  Table 
1  summarizes  their  magnitudes  and  epicentral  distances  to  the  ARCESS  array.  We  also 
analyzed  an  event  which  occurred  on  Novaya  Zemlya  on  31  December  1992  (921231). 
This event is labelled by its origin identification number (ORID = 361575) in Figure 7.


Figure  8  shows  Pn/Lg  values  in  nine  frequency  bands  for  the  Kola  Peninsula  and  Kiruna 
quarry  blasts  and  Steigen  earthquakes,  before  (left)  and  after  (right)  applying  distance 
corrections.  Pn/Lg  measurements  were  not  available  for  the  4  Novaya  Zemlya  events,  nor 
the  5  earthquakes  near  Spitsbergen,  since  Lg  does  not  propagate  efficiently  beneath  the 
Barents  Sea.  Distance  corrections  used  here  were  obtained  from  Sereno  (1990)  in  which  he 
derived  distance  relations  for  Pn,  Pg,  Lg  and  Sn  in  Fennoscandia  based  on  97  regional 
events  recorded  in  common  by  ARCESS  and  NORESS.  All  events  were  reported  in  the 
Helsinki Bulletin. (Distance corrections for Lg may be invalid for frequencies above 5 Hz
due  to  attenuation  of  high-frequency  Lg  for  events  at  far  regional  distances  in  this  region. 
At  present,  however,  these  are  the  best  corrections  available.)  Pn/Lg  measurements  in  the 
4-6  and  5-7  Hz  bands  provide  the  best  discrimination  of  the  Steigen  earthquakes  and  Kola 
quarry  blasts.  Pn/Lg  values  for  the  Kiruna  quarry  blasts  are  more  similar  to  those  for  the 
Steigen  earthquakes  before  applying  distance  corrections,  while  more  like  those  for  the 
Kola  quarry  blasts  after.  (Cf.  Table  1  for  distances  of  these  events  to  ARCESS.) 


Figure  9  shows  scatter  plots  of  Pn/Sn  maximum  amplitude  measurements  in  six  frequency 
bands,  before  (left)  and  after  (right)  applying  distance  corrections,  for  events  recorded  by 
the  ARCESS  array.  These  events  include  53  Kola  quarry  blasts,  2  Kiruna  quarry  blasts,  24 
Steigen  earthquakes,  5  earthquakes  near  Spitsbergen,  3  nuclear  explosions  at  the  Novaya 
Zemlya  test  site,  and  the  921231  Novaya  Zemlya  event.  The  legends  associate  the  marker 
type  with  the  event  type.  Distance  corrections  for  Pn/Sn  were  also  obtained  from  Sereno 
(1990).  Pn/Sn  values  were  provided  for  only  2  of  the  39  Kiruna  quarry  blasts. 

There  are  several  points  to  make  regarding  these  data.  First,  Pn/Sn  values  for  the  Novaya 
Zemlya  explosions  are  higher  than  those  for  the  earthquakes,  except  in  the  3-5  Hz  band,  for 
which  Pn/Sn  for  an  earthquake  near  Spitsbergen  is  greater  than  values  in  this  band  for  all 
three  nuclear  explosions  and  most  quarry  blasts.  Second,  Pn/Sn  does  not  discriminate 
nuclear explosions from quarry blasts at ARCESS, particularly after applying distance
corrections.  Third,  Pn/Sn  values  for  the  Steigen  and  Spitsbergen  earthquakes  are  more 
consistent  with  one  another  after  applying  distance  corrections.  Fourth,  Pn/Sn  values  for 
the  921231  event  are  more  consistent  with  those  for  quarry  blasts  before  distance- 
correcting,  while  more  consistent  with  earthquakes  after.  Note,  however,  that  the  different 
frequency  bands  provide  different  evidence  as  to  the  identification  of  the  921231  event;  its 
Pn/Sn(6-8  Hz)  value  seems  peculiarly  high  as  compared  to  other  bands. 

Figure 8. Scatter plots of Pn/Lg in nine frequency bands, before (left) and after (right) applying
distance  corrections,  for  53  quarry  blasts  on  the  Kola  Peninsula,  39  quarry  blasts  in  the  Kiruna  region 
in  Sweden  and  24  earthquakes  near  Steigen,  Norway,  recorded  by  the  ARCESS  array. 


Figure  9.  Scatter  plots  of  Pn/Sn  in  six  frequency  bands,  before  (left)  and  after  (right)  applying  distance 
corrections,  for  53  Kola  and  2  Kiruna  quarry  blasts,  24  Steigen  and  5  Spitsbergen  earthquakes,  3 
Novaya  Zemlya  nuclear  explosions,  and  the  921231  Novaya  Zemlya  event,  recorded  by  ARCESS. 

We were provided with Lg spectral ratios for the quarry blasts and one of the Steigen
earthquakes,  but  none  were  available  for  the  Novaya  Zemlya,  Spitsbergen,  and  remainder 
of  the  Steigen  events.  Thus,  we  were  unable  to  use  Lg  spectral  ratios  for  the  ARCESS 
events  in  studies  of  regional  identification  performance. 

2.3.  GERESS  Data  Set 

This  data  set  includes  13  quarry  blasts  and  10  earthquakes  which  occurred  in  the  Vogtland 
region,  roughly  180  km  northwest  of  the  GERESS  array.  It  also  consists  of  30  rock  bursts 
or  induced  mine  tremors  in  the  Lubin  Copper  Basin  in  Poland,  roughly  350  km  northeast 
of  GERESS.  Figure  7  shows  locations  of  these  events  relative  to  GERESS.  (See  Grant  et 
al.,  1993,  for  further  details  regarding  these  events.)  Figure  10  shows  Pn/Lg  (left)  and  Pn/ 
Sn  (right)  values  in  nine  frequency  bands,  based  on  maximum  amplitude  measurements. 
(No  distance  corrections  were  applied  to  these  data.  However,  the  Vogtland  quarry  blasts 
and  earthquakes  are  nearly  collocated  and  their  propagation  distances  are  roughly  equal  to 
the reference distance, $\Delta_0$ = 200 km, relative to which corrections are applied.)

Figure 10. Scatter plots of Pn/Lg (left) and Pn/Sn (right) in nine frequency bands for 13 quarry blasts,
10  earthquakes  and  30  rock  bursts  recorded  by  the  GERESS  array. 


The  highest  frequency  bands  provide  the  best  discrimination  of  earthquakes  and  quarry 
blasts.  Pn/Lg  values  for  the  Lubin  rock  bursts  are  very  similar  to  those  for  the  Vogtland 
earthquakes,  while  their  Pn/Sn  values  fall  between  the  values  for  the  earthquakes  and 
quarry  blasts.  This  may  be  due  to  the  fact  that  the  rock  bursts  were  shallow  (less  than  1  km 
deep),  generating  considerable  Lg.  Note  that  distance  corrections  would  cause  Pn/Lg  and 
Pn/Sn  values  for  the  Lubin  events  to  be  even  more  like  those  for  the  Vogtland  earthquakes. 

Figure 11 shows Lg spectral ratios, based on maximum (left) and rms (right) measurements
of Lg spectra in three combinations of frequency bands, for the Vogtland quarry blasts and
earthquakes and Lubin rock bursts recorded by GERESS. Except for one quarry blast, the
earthquakes  and  quarry  blasts  separate  completely.  Baumgardt  et  al.  (1992)  showed  that 
this  ripple-fired  quarry  blast  generated  large  modulations  in  the  observed  spectrum, 
anomalously  enriching  the  high  frequency  content.  Lg  spectral  ratios  for  the  Lubin  rock 
bursts  fall  between  the  values  for  the  earthquakes  and  quarry  blasts.  This  is  likely  due  to 
their  shallow  depth  and  near-source  medium  properties,  as  well  as  a  mixture  of  source 
characteristics,  since  some  of  the  tremors  were  induced  by  explosions  (Baumgardt,  1994). 

Figure 11. Lg spectral ratios for earthquakes, quarry blasts and rock bursts recorded by GERESS,
based  on  maximum  (left)  and  rms  (right)  measurements  in  three  combinations  of  frequency  bands. 


2.4.  KNB  and  MNV  Data  Sets 

Patton  and  Walter  (1994)  provided  us  with  data  for  a  total  of  76  earthquakes,  141  nuclear 
explosions,  and  1  contained  1  kt  chemical  explosion  at  NTS,  recorded  by  LNN  stations 
KNB  and  MNV.  The  chemical  explosion  was  the  September  1993  Non-Proliferation 
Experiment  (NPE).  Due  to  signal-to-noise  limitations  for  some  events,  phase  ratio  and  low- 
to-high  spectral  measurements  were  computed  for  59  earthquakes  and  89  explosions  of  the 
events  recorded  by  KNB.  Similar  measurements  were  computed  for  37  earthquakes  and  78 
explosions  of  those  recorded  by  MNV.  Measurements  for  the  NPE  were  provided  for  both 
stations.  Two  other  LNN  stations,  ELK  and  LAC,  did  not  record  many  of  these  earthquakes 
due  to  intermittent  operation  during  the  period  when  they  occurred  (Walter  et  al.,  1994). 

Figure 12 shows a map of the southwestern U.S. depicting locations of LNN stations KNB
and  MNV  relative  to  NTS.  On  the  left  of  Figure  12  is  an  enlarged  map  of  NTS  showing 
epicenters  of  the  nuclear  explosions  and  earthquakes  used  in  this  study.  The  earthquakes, 
ranging  from  magnitude  (ML)  2.1  to  5.9,  were  clustered  in  three  main  locations:  Little 
Skull  Mountain,  Rock  Valley,  and  one  at  Massachusetts  Mountain.  Walter  et  al.  (1994) 
provide  a  detailed  discussion  of  the  mainshocks  and  aftershock  sequences,  their  focal 
mechanisms,  and  a  description  of  relevant  geology.  Most  of  the  Little  Skull  Mountain 
sequence occurred between 6 and 12 km in depth (Meremonte et al., 1994), while the depth
of  the  Rock  Valley  sequence  was  constrained  to  1-3  km,  based  on  measurements  by  a 
portable  instrument  located  1.5  km  from  the  epicenter  (Smith  and  Brune,  1993).  The 
explosions  ranged  in  magnitude  from  ML  2.4  to  5.5  and  in  depth  from  200  to  nearly  700 
meters,  in  media  with  a  relatively  wide  range  of  properties. 


Figure  12.  Map  of  the  southwestern  U.S.  showing  locations  of  LNN  stations  KNB  and  MNV  relative  to 
NTS.  On  the  left  is  an  enlarged  map  of  NTS  showing  epicenters  of  the  nuclear  explosions  and 
earthquakes  used  in  this  study.  (Courtesy  of  W.  Walter  of  LLNL.) 

This  data  set  has  several  interesting  features  with  regard  to  assessing  regional  identification 
performance  in  the  context  of  seismic  monitoring  of  a  CTBT.  First,  regional  geology  in  the 
western  U.S.  complements  that  in  the  other  regions  considered  in  this  study.  Second,  the 
events  have  similar  paths,  in  contrast  to  previous  western  U.S.  data  sets  (e.g.,  Taylor  et  al., 
1989).  Thus,  source  effects  are  better  isolated  from  regional  path  variations.  Third,  there 
are  earthquakes  and  nuclear  explosions  of  widely  varying  magnitude,  including  explosions 
of  magnitude  2.5  or  less.  Fourth,  there  are  both  shallow  and  normal-depth  earthquakes. 
Fifth,  there  are  nuclear  explosions  at  various  depths  in  media  of  varying  properties;  source 
medium  properties  such  as  density,  P-wave  velocity,  and  gas  porosity  are  available  for  all 
of the explosions. These features provide a unique data set with which to better understand
effects  of  depth  and  source  medium  properties  on  regional  discriminants. 

Figure  13  shows  Pn/Lg  and  Pg/Lg  measurements  in  the  6-8  Hz  band  for  the  events  recorded 
by  KNB  and  MNV.  (Note  that  Sn  measurements  were  not  available  since  Sn  does  not 
propagate efficiently in the western U.S.) The explosions are separated into those
detonated in media with high versus low gas porosity (GP), and the earthquakes are
separated by the subregions in which they occurred. The legend associates the marker types
with  the  various  events.  Pn/Lg  values  at  MNV  provide  the  best  discrimination  of 
earthquakes  and  explosions  in  this  region,  while  Pg/Lg  does  not  discriminate  as  well.  There 
is  more  overlap  of  both  Pn/Lg  and  Pg/Lg  for  the  two  event  types  at  KNB  than  at  MNV. 


Figure  13.  Scatter  plots  of  Pn/Lg  and  Pg/Lg  measurements  in  the  6-8  Hz  band  for  seismic  events 
recorded  by  LNN  stations  KNB  and  MNV. 

Walter et al. (1994) noted that Pn/Lg and Pg/Lg exhibit different values for shallow than
for deeper earthquakes, with different dependence at the two stations and for the two types
of phase ratios, possibly due to radiation pattern or path effects. There does not appear to
be a significant difference between Pn/Lg and Pg/Lg values for explosions in media of high
versus  low  gas  porosity,  except  for  Pg/Lg  at  MNV  for  which  explosions  in  media  of  high 
gas  porosity  typically  have  lower  Pg/Lg  values.  By  averaging  P/Lg  values  over  both 
stations,  Walter  et  al.  (1994)  found  that  Pg/Lg  values  are  generally  lower  for  explosions  in 
high gas porosity media, while no such dependence was observed for Pn/Lg. Figure 14
shows  that  Pn/Lg  in  the  6-8  Hz  band  exhibits  some  dependence  on  magnitude,  ML(coda), 
at  MNV  (right)  and  more  noticeably  at  KNB  (left),  although  this  is  actually  a  signal-to- 
noise  effect.  The  same  is  also  true  of  Pg/Lg. 


Note  that  the  Pn/Lg  and  Pg/Lg  values  for  the  NPE  are  among  the  highest  for  explosions 
seen  at  MNV,  while  its  Pn/Lg  value  at  KNB  falls  in  the  middle  of  the  explosion  group  and 
its  Pg/Lg  value  at  KNB  is  among  the  smallest.  The  cause  of  this  differing  behavior  is  not 
immediately  apparent.  However,  it  is  clear  that  Pn/Lg  and  Pg/Lg  do  not  discriminate 
nuclear  and  single  contained  chemical  explosions  in  this  region  and,  very  likely,  elsewhere. 

Figure 14. Pn/Lg measurements in the 6-8 Hz band versus ML(coda) for events recorded by KNB (left)
and  MNV  (right). 


Note  that  discrimination  using  Pn/Lg(6-8  Hz)  appears  to  be  better  at  MNV  than  KNB  for 
two  reasons.  First,  Pn/Lg(6-8  Hz)  measurements  for  the  shallow  Rock  Valley  earthquakes 
are  noticeably  high  at  KNB.  Second,  due  to  better  signal-to-noise  ratio  (SNR)  at  KNB  than 
MNV,  there  are  measurements  at  KNB  for  more  of  the  small  events.  Although  these  small 
events seen at KNB passed an SNR test, poorer SNR for small magnitude events contributes
to  more  overlap  of  Pn/Lg  values  for  earthquakes  and  explosions. 


Figure  15  shows  Lg  spectral  ratio  values  of  three  different  types  for  the  same  events.  The 
first  two  are  ratios  of  Lg  coda  measurements  in  the  1-2  Hz  versus  6-8  or  8-10  Hz  bands. 
The  third  is  the  ratio  of  direct  Lg  measurements  in  the  1-2  Hz  versus  6-8  Hz  bands.  Of  these, 
Lgcoda(1-2 Hz/8-10 Hz) at MNV discriminates the best although, due to poorer SNR in the
8-10 Hz band, Lgcoda(1-2 Hz/6-8 Hz) was measured for more events and provides nearly
as effective discrimination. Thus, Lgcoda(1-2 Hz/6-8 Hz) was used, along with Pn/Lg(6-8
Hz), in the identification studies described in Section 4.

[Figure 15 panel titles: Lgcoda(1-2/6-8), Lgcoda(1-2/8-10), Lg(1-2/6-8)]

Figure  15.  Scatter  plots  of  Lg  spectral  ratio  measurements  of  three  different  types  for  seismic  events 
recorded by LNN stations KNB and MNV. Events are separated into subgroups as in Figure 13.

Figure  16  shows  that  Lg  spectral  ratios  exhibit  significant  magnitude  dependence.  In  fact, 
Walter et al. (1994) noted that none of the spectral ratios discriminate well for events below
ML  3.5.  For  explosions,  they  also  showed  that  Pn,  Pg  and  Lg  spectral  ratios  exhibit  strong 
dependence  on  gas  porosity  of  the  near-source  medium,  as  a  function  of  magnitude,  and  not 
explicitly  on  event  depth. 


Figure  16.  Spectral  ratio  (1-2  Hz/6-8  Hz)  measurements  of  Lg  coda  versus  ML(coda)  for  events 
recorded  by  KNB  (left)  and  MNV  (right).  Lg  spectral  ratios  exhibit  significant  magnitude  dependence 
and do not discriminate earthquakes and explosions in this region below ML 3.5.

In summary, Pn/Lg(6-8 Hz) provides the best discrimination, particularly at MNV, and
varies somewhat, but not dramatically, with magnitude or source medium properties. Pn/Lg
does  exhibit  some  dependence  on  source  depth  for  earthquakes  seen  at  KNB,  with  higher 
values  for  shallower  earthquakes,  making  them  more  difficult  to  discriminate  from 
explosions there. This may be due, at least partially, to radiation pattern or path effects.
Pg/Lg does not discriminate as well as Pn/Lg and appears to be more sensitive to near-source
and  path  effects.  Pg/Lg  does  not  vary  dramatically  with  magnitude,  nor  consistently  with 
depth  or  source  medium  properties  from  one  station  to  the  other.  Pg/Lg  at  MNV  or  station- 
averaged  does,  however,  exhibit  correlation  to  gas  porosity  for  the  explosions.  Lg  spectral 
ratios  provide  fairly  good  discrimination  for  events  above  ML  3.5,  particularly  for  high  gas 
porosity  explosions.  Lg  spectral  ratios  do  exhibit  strong  correlation  to  magnitude,  as  well 
as  to  gas  porosity,  with  no  significant  correlation  explicitly  to  depth. 

2.5.  Data  Set  Comparisons  and  Pn/Lg  Transportability 

Comparisons  of  high-frequency  discriminants  for  the  different  regions  represented  by  the 
various  data  sets  in  this  study  are  interesting  from  the  standpoint  of  understanding  where 
certain  discriminants  work  and  effects  of  regional  tectonic  variations.  First,  Lg  spectral 
ratios  appear  to  discriminate  at  GERESS  and  in  the  western  U.S.,  but  not  at  WMQ  or 
ARCESS.  Second,  high-frequency  Pg/Lg  discriminates  fairly  well  at  WMQ,  GERESS  and 
in  the  western  U.S.,  but  not  at  ARCESS.  Third,  high-frequency  Pn/Lg  and  Pn/Sn  appear  to 
discriminate  robustly  in  most  regions  for  which  Lg  or  Sn  are  observed.  In  many  cases  both 
Pn/Lg  and  Pn/Sn  can  be  measured  and  used,  while  in  others  either  Sn  or  Lg  is  attenuated 
or  blocked  by  regional  structures.  P/Sn  does  not  appear  to  be  a  viable  discriminant  at  WMQ 
and  in  the  western  U.S.  since  Sn  signals  are  too  weak  in  these  regions,  particularly  for 
explosions.  P/Lg  does  not  appear  to  be  useful  for  events  with  paths  beneath  oceans  nor  for 
paths  with  thin  crustal  structure  (e.g.,  Baumgardt,  1990).  These  observations  are  consistent 
with  those  of  Bennett  et  al.  (1989),  Baumgardt  et  al.  (1992)  and  many  others. 

It  has  long  been  recognized  that  different  seismic  phases  attenuate  at  different  rates  as 
functions  of  distance  and  frequency,  even  for  relatively  simple  earth  structures.  In  general, 
Sn  and  Lg  attenuate  more  rapidly  with  distance  than  Pn  or  Pg.  Thus,  events  at  far  regional 
distances  may  appear  to  be  more  explosion-like  than  ones  at  closer  distances.  This  allows 
for  the  possibility  that  nearby  explosions  or  distant  earthquakes  may  be  misidentified.  An 
example  is  the  identification  of  a  921231  Novaya  Zemlya  event,  roughly  1100  km  from 
ARCESS.  Before  applying  distance  corrections  to  the  Pn/Sn  discriminants,  Baumgardt 
(1993b), Fisk and Gray (1993) and Pulli and Dysart (1993) found this event to be far more
consistent with quarry blasts on the Kola Peninsula than earthquakes in Scandinavia, whose
distances ranged from 300 to 500 km. (All authors agreed that this event was not a nuclear
detonation.) These results are inconsistent with a statement by the Russian Federation that
no blasting had occurred. After applying distance corrections, Fisk (1993) found that this
event was rejected as a quarry blast. In addition, Figure 4 illustrates how the 880929 Lop
Nor  nuclear  explosion  recorded  by  WMQ  could  be  misidentified  using  uncorrected  Pn/Lg 
values.  Thus,  distance-correcting  Pn/Lg  and  Pn/Sn  is  very  important. 

Also,  the  optimal  frequency  band  for  Pn/Lg  and  Pn/Sn  depends  on  the  characteristics  of  the 
seismic  instruments  and  filter  functions  used,  as  well  as  on  regional  geophysical  effects. 
For  example,  Pn/Lg  and  Pn/Sn  at  4-7  Hz  discriminate  the  best  at  ARCESS,  while  those  at 
8-16  Hz  discriminate  much  better  at  GERESS.  In  general,  Pn/Lg  and  Pn/Sn  in  the  3-8  Hz 
band appear to discriminate the most consistently in all regions we have studied.

Although  these  discriminants  typically  separate  earthquake  and  explosion  groups,  the 
threshold  between  the  two  groups  can  vary  dramatically  in  different  regions.  Figure  17,  for 
example,  shows  plots  of  uncorrected  (left)  and  distance-corrected  (right)  Pn/Lg  in  the  6-8 
Hz  band  for  the  various  events  recorded  by  ARCESS,  GERESS,  WMQ,  MNV,  and  KNB. 
The  discrimination  threshold  for  uncorrected  Pn/Lg  at  WMQ  is  roughly  a  factor  of  ten 
greater  than  those  at  the  other  stations.  The  distance-corrected  thresholds  are  all  fairly 
similar  except  for  the  threshold  at  ARCESS,  which  differs  by  almost  an  order  of  magnitude. 
(Note  that  the  distance  corrections  for  Pn/Lg  at  ARCESS  may  be  invalid  for  this  frequency 
band,  although  these  were  the  best  corrections  available.)  Even  for  the  distance-corrected 
case,  a  threshold  obtained  for  KNB  would  lead  to  missed  nuclear  explosions  at  WMQ. 
Similarly,  the  threshold  for  WMQ  would  lead  to  roughly  a  30%  false  alarm  rate  at  KNB. 
Since,  in  most  cases,  there  is  no  clean  separation  between  the  earthquake  and  explosion 
groups,  the  threshold  must  be  set  very  precisely  in  order  to  have  high  identification  and  low 
false  alarm  rates.  Thus,  a  discrimination  rule  established  in  one  region  does  not  readily 
apply  to  another.  (Note  that  the  outlier-detection  approach  we  use  in  Section  4  does  not  rely 
on  transporting  discrimination  thresholds.) 

Relative  path  corrections  may  improve  Pn/Lg  transportability.  Progress  has  been  made  by 
Baumgardt  and  Der  (1994)  and  Lay  and  Zhang  (1994)  in  relating  Pn/Lg  values  to  tectonic 
(crustal  and  topographic)  properties,  although  there  are  non-negligible  uncertainties. 
Evidence  suggests  that  empirical  path  corrections  and  theoretical  modeling  must  be  very 
accurate to succeed. Station corrections may also be needed. Note that many of the same
events were observed at MNV and KNB, while the Pn/Lg discrimination thresholds differ
somewhat  for  the  two  stations.  Practical  difficulties  include  the  fact  that  a  common  set  of 
regional  events,  needed  to  derive  corrections  for  a  station  lacking  nuclear  explosion  data, 
is  unlikely  to  also  be  recorded  at  stations  for  which  such  data  exist.  As  a  result  of  these 
complications,  it  has  yet  to  be  shown  that  a  regional  high-frequency  discrimination 
threshold  from  one  region  can  be  successfully  used  to  identify  events  in  a  new  region. 


Figure  17.  Uncorrected  (left)  and  distance-corrected  (right)  Pn/Lg(6-8  Hz)  for  events  recorded  by 
ARCESS,  GERESS,  WMQ,  MNV  and  KNB.  Horizontal  lines  depict  discrimination  thresholds  for  each 
station. 

3.  Technical  Approach 

3.1.  Motivation 

The  objective  is  to  identify  underground  nuclear  explosions  down  to  a  few  kilotons  (kt)  or 
less,  possibly  as  low  as  1  kt  decoupled.  A  1  kt  decoupled  explosion  corresponds  roughly  to 
a  magnitude  2.5  event  or,  equivalently,  the  size  of  a  relatively  large  mining  blast  (Blandford 
et  al.,  1992).  Above  this  threshold,  Ringdal  (1984)  and  Lilwall  and  Douglas  (1985)  estimate 
that  there  will  be  roughly  200,000  earthquakes  worldwide  per  year.  In  addition,  there  will 
be  a  large  number  of  quarry  blasts  and  induced  mine  tremors  that  occur  annually 
worldwide,  although  there  is  some  debate  as  to  explicit  numbers.  Nevertheless,  it  is  clear 
that  there  will  be  many  thousands  of  competing  seismic  signals,  mostly  from  earthquakes 
and  mining  blasts,  from  which  to  try  to  identify  a  potential  underground  nuclear  test. 

Our goal is to provide an automated procedure to flag suspicious events that warrant
special  attention  so  that  they  can  be  investigated  further  by  expert  analysts  and,  possibly, 
by  non-seismic  techniques  (e.g.,  on-site  inspection).  Unfortunately  we  lack  seismic 
calibration data for nuclear explosions in most regions around the world. In addition, as
discussed above in Section 2.5, it has yet to be shown that a regional high-frequency
discrimination  threshold  from  one  region  can  be  successfully  transported  in  order  to 
identify  events  in  a  new  region.  Thus,  in  most  cases,  event  identification,  within  the  context 
of  CTBT  or  NPT  monitoring,  is  a  problem  of  detecting  unusual  events,  i.e.,  outliers. 

For  regions  in  which  training  information  is  available  for  more  than  one  class  of  events 
(e.g.,  for  regions  which  include  historical  test  sites)  we  also  developed  a  classification  test, 
based  on  the  same  methodology  as  the  outlier  test.  This  test  provides  improved 
identification  accuracy  over  the  outlier  test  for  cases  in  which  more  information  is 
available.  Below  we  primarily  concentrate  on  monitoring  results  obtained  using  the  outlier 
test  because  of  its  more  general  applicability. 

3.2.  Outlier  Detection  Approach 

If  a  station  exists  or  has  been  recently  installed,  the  first  step  is  to  collect  a  set  of  events 
recorded  by  the  station.  In  most  practical  situations,  we  will  not  know  the  event  types,  a 
priori.  An  assumption  is  made  that  the  number  of  new  nuclear  tests  in  a  region  will  be 
relatively  small  compared  to  the  number  of  earthquakes,  mining  blasts  or  rock  bursts.  If  this 
assumption fails, i.e., if a region is aseismic and has no mining activity, then any new
event  would  be  suspicious,  warranting  further  investigation.  Since  there  are  only  a  small 
number  of  mining  blasts  that  occur  annually  above  mb  3,  our  primary  concern  for 
monitoring  above  this  threshold  is  to  distinguish  nuclear  explosions  from  earthquakes.  We 
focus  on  this  case  first.  (Monitoring  down  to  magnitude  2.5  poses  greater  difficulty.)  In 
Section  4.2,  we  also  consider  the  case  in  which  earthquake  data  sets  are  contaminated  by 
large  quarry  blasts  or  rock  bursts  and  assess  how  this  impacts  monitoring  performance. 

A  standard  set  of  discriminants  is  used,  unless  we  know  that  particular  ones  work  best  for 
a  given  region.  In  this  study  we  use  Pn/Lg  and  Pn/Sn  in  the  3-5, 4-6,  5-7  and  6-8  Hz  bands, 
as  well  as  an  Lg  spectral  ratio,  provided  measurements  were  available  for  the  events.  In 
general,  the  outlier  test  can  rigorously  include  any  discrete  or  continuous  discriminant  from 
single  or  multiple  stations  and  arrays.  Distance  corrections  are  applied  if  information  is 
available.  Each  event  is  then  tested  as  an  outlier  of  the  remaining  data  set  using  a  hypothesis 
test  based  on  the  generalized  likelihood  ratio.  (Technical  details  of  the  outlier  test  and 
previous  applications  are  provided  by  Baek  et  al.,  1992;  Fisk  et  al.,  1993a,  1993b;  Fisk  and 
Gray,  1993;  Fisk,  1993.)  Events  flagged  as  outliers  can  be  investigated  further  using, 
possibly,  non-seismic  means.  The  significance  level  of  the  test  is  an  input  parameter  which 
controls the false alarm rate, e.g., the percentage of earthquakes flagged as outliers.

The basic idea behind the likelihood ratio test is to form the ratio of likelihoods,

$$
\lambda = \frac{\max L(\text{parameters} \mid \text{data; under null hypothesis})}
               {\max L(\text{parameters} \mid \text{data; under alternative hypothesis})},
$$

where  the  numerator  is  computed  under  the  null  hypothesis  being  tested,  given  the  data,  and 
the  denominator  is  computed  under  the  alternative  hypothesis.  For  the  outlier  test,  the  null 
hypothesis  is  that  the  event  being  tested  belongs  to  the  same  population  as  the  remainder  of 
the events. The alternative hypothesis is that it does not. Since the true parameters of the
likelihood  functions  are  unknown,  maximum  likelihood  estimates  (MLEs)  are  computed 
from  the  data,  subject  to  the  particular  hypothesis,  and  inserted  into  the  likelihood 
functions.  This  yields  the  ratio  of  maximized  likelihoods.  The  MLEs  of  the  unknown 
parameters  (e.g.,  the  mean  and  covariance  matrix  of  the  discriminants)  are  computed 
subject  to  these  two  hypotheses. 
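
As an illustration only, the ratio can be written in closed form under one particular assumption that is not stated in the text: that the discriminant vectors are multivariate normal with a common covariance matrix. The symbols $x_0$ (event under test), $x_1, \dots, x_n$ (training events), $\bar{x}$ (training mean) and $S$ (training scatter matrix) are notation introduced here for the sketch. The cited references give the forms actually used.

```latex
% Null hypothesis: x_0, x_1, ..., x_n share one mean and covariance.
% Alternative:     x_0 has its own mean; the covariance is still common.
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
S = \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{T}, \\
\hat{\Sigma}_1 &= \frac{1}{n+1}\, S, \qquad
\hat{\Sigma}_0 = \frac{1}{n+1}\left[ S + \frac{n}{n+1}\,(x_0 - \bar{x})(x_0 - \bar{x})^{T} \right], \\
\lambda &= \left( \frac{|\hat{\Sigma}_1|}{|\hat{\Sigma}_0|} \right)^{(n+1)/2}
         = \left[ 1 + \frac{n}{n+1}\,(x_0 - \bar{x})^{T} S^{-1} (x_0 - \bar{x}) \right]^{-(n+1)/2}.
\end{align*}
```

In this special case the ratio lies between 0 and 1 and decreases as the tested event moves away from the training mean, measured in units set by the training scatter.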

Small values of the likelihood ratio, λ, indicate that the null hypothesis should be rejected.
To quantify what is meant by “small”, the distribution of λ is needed. In general, its
distribution is quite complicated, depending on the multivariate distribution of the
discriminants. To estimate it empirically, we use the bootstrap technique (Efron, 1979) to
generate random samples from the distribution of actual data. Bootstrapped data are inserted
in the likelihood ratio for many samples to obtain its distribution. From this distribution, the
critical value is set so that the test has a specified false alarm rate. All events whose likelihood
ratios are less than the critical value are considered outliers at the specified significance level.
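
The procedure above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in this study: it assumes multivariate normal discriminants with a common covariance matrix (which gives the likelihood ratio a simple closed form), and it bootstraps by resampling the training events with replacement. The function names, the synthetic data and the default of 2000 bootstrap samples are all hypothetical choices made for the example.

```python
import numpy as np

def log_likelihood_ratio(x0, training):
    """Log of the Gaussian outlier likelihood ratio for event x0 against a training set.

    Assumes multivariate normal discriminants with a common covariance matrix.
    Values near 0 are consistent with the training set; large negative values are not.
    """
    training = np.asarray(training, dtype=float)
    n = training.shape[0]
    mean = training.mean(axis=0)
    centered = training - mean
    scatter = centered.T @ centered          # sum of outer products about the mean
    d = np.asarray(x0, dtype=float) - mean
    t2 = d @ np.linalg.solve(scatter, d)     # d^T S^{-1} d
    return -0.5 * (n + 1) * np.log1p(n / (n + 1) * t2)

def bootstrap_critical_value(training, alpha=0.01, n_boot=2000, seed=0):
    """Estimate the alpha-quantile of the log likelihood ratio under the null hypothesis."""
    rng = np.random.default_rng(seed)
    training = np.asarray(training, dtype=float)
    n = training.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample with replacement: one pseudo "new" event plus n pseudo training events.
        sample = training[rng.integers(0, n, size=n + 1)]
        stats[b] = log_likelihood_ratio(sample[0], sample[1:])
    return np.quantile(stats, alpha)

def is_outlier(x0, training, critical_value):
    """Flag x0 when its log likelihood ratio falls below the critical value."""
    return log_likelihood_ratio(x0, training) < critical_value

# Synthetic example: 60 "earthquakes" described by two discriminants.
rng = np.random.default_rng(1)
earthquakes = rng.normal(size=(60, 2))
critical = bootstrap_critical_value(earthquakes, alpha=0.01)
print(is_outlier([8.0, 8.0], earthquakes, critical))
```

Working with the logarithm of the ratio keeps the statistic numerically stable and does not change the ordering of events, so the same values can be used for ranking.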

The  key  innovation  of  our  approach  is  the  combination  of  the  likelihood  ratio  and  bootstrap 
techniques,  which  allows  application  to  any  discriminant  distribution  for  which  the  MLEs 
exist,  while  controlling  the  error  rate  associated  with  a  particular  type  of  event.  In  addition, 
the  likelihood  ratio  is  a  useful  metric  with  which  to  rank  events,  by  combining  multivariate 
discriminant  data  in  a  univariate  expression.  (This  capability  is  illustrated  in  Section  4.) 
This  provides  an  alternative  to  imposing  a  rigid  “yes/no”  judgement  and  a  means  to  focus 
on  the  most  “suspicious”  events. 

3.3.  Classification  Approach 

For  regions  in  which  training  data  are  available  for  two  or  more  types  of  events  (e.g.,  for 
any  combination  of  two  or  more  training  sets  in  a  region  which  include  earthquakes,  quarry 
blasts, rock bursts or nuclear explosions), we developed a classification test, based also on
the likelihood ratio methodology (e.g., Baek et al., 1993; Fisk et al., 1993a). In this case the
likelihood functions are extended to treat training data for two event classes and the event
in question is allocated to one of two classes. (The test is applied sequentially to cases
with more than two training sets. Future work is planned to extend the methodology to
explicitly treat more than two event classes.) This method provides a hypothesis test in
which  one  of  the  error  rates  is  controlled  or  a  classification  test  in  the  conventional  sense 
in  which  the  overall  error  rate  or  a  cost  function  is  minimized.  Like  the  outlier  test,  it  also 
allows  events  to  be  ranked  straightforwardly.  Fisk  et  al.  (1993a)  showed  that  this  test  has 
greater  accuracy  than  the  outlier  test  if  more  information  is  available. 

The explicit form of the outlier and classification tests depends on the distributions of the
discriminants  used  which,  in  general,  may  differ  for  different  sets  of  discriminants  and 
event  types.  Thus,  we  have  implemented  statistical  algorithms  to  test  appropriate 
assumptions  and,  if  necessary,  to  transform  the  data  to  a  form  which  ensures  validity.  Also, 
in  many  cases  some  of  the  discriminant  values  are  missing  for  particular  events.  This  may 
be  due,  for  example,  to  blockage  or  strong  attenuation  of  particular  seismic  phases,  poor 
signal-to-noise, interference by other events, or instrument malfunction. Gray et al. (1994)
generalized  the  methods  to  treat  missing  discriminant  measurements  for  some  of  the  events. 
We  have  also  implemented  an  algorithm  to  automatically  select  the  best  set  of 
discriminants  for  a  given  region  if  adequate  training  sets  are  available. 


4.  Monitoring  Applications 

4.1.  Applications  of  the  Outlier  Test 

Using  the  procedure  described  in  Section  3.2,  we  sequentially  used  each  of  the  nuclear 
explosions  in  each  region  as  the  “new”  explosion.  Each  explosion  was  tested  against  the 
earthquakes  to  determine  the  percentage  identified  as  outliers.  We  also  considered  each  of 
the  quarry  blasts  as  the  “new”  explosion  in  their  respective  regions  to  supplement  the 
nuclear  explosion  data.  (Note  that  it  is  generally  more  difficult  to  discriminate  earthquakes 
from  quarry  blasts  than  contained  nuclear  explosions.  Thus,  if  quarry  blasts  are  detected  as 
outliers  of  the  earthquake  group,  a  nuclear  explosion  is  also  very  likely  to  be  detected.) 
Each  earthquake  was  also  sequentially  tested  using  the  leave-one-out  procedure  to  estimate 
the  false  alarm  rate.  In  the  following,  we  set  the  significance  level  at  0.01  which,  in  practice, 
should lead to a false alarm rate of 1%. Doing this, we obtained the following results.

For ARCESS, using the Steigen and Spitsbergen earthquakes to train the outlier test, 3 of
3 Novaya Zemlya nuclear explosions, 50 of 51 (98%) Kola quarry blasts, and 36 of 39
(92%) Kiruna quarry blasts were flagged as outliers once distance corrections were applied
to the discriminants. For GERESS, using the Vogtland earthquakes to train the outlier test,
13 of 13 Vogtland quarry blasts were detected as outliers. There were no false alarms at
either  ARCESS  or  GERESS.  For  CDSN  station  WMQ,  17  of  17  Balapan  and  Lop  Nor 
nuclear  explosions  were  flagged  as  outliers  with  1  false  alarm  out  of  23  earthquakes.  For 
the  LNN  stations,  74  of  78  (95%)  and  71  of  89  (80%)  NTS  nuclear  explosions  were 
detected  as  outliers  at  MNV  and  KNB,  respectively,  with  one  false  alarm  at  each  station. 
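
The bookkeeping behind these rates (each explosion tested as the "new" event against the earthquakes, and each earthquake tested against the remaining earthquakes) might be sketched as follows. The detector used here is a deliberately simple z-score rule on a single synthetic discriminant; it is a stand-in chosen for brevity and is not the likelihood ratio test of Section 3.2. All names and data values are hypothetical.

```python
import numpy as np

def zscore_detector(event, training, cutoff=3.0):
    """Stand-in detector for illustration only (not the likelihood ratio test)."""
    training = np.asarray(training, dtype=float)
    return abs(event - training.mean()) / training.std(ddof=1) > cutoff

def leave_one_out_false_alarm_rate(earthquakes, is_outlier):
    """Test each earthquake against all the others; return the fraction flagged."""
    earthquakes = np.asarray(earthquakes, dtype=float)
    flags = [bool(is_outlier(earthquakes[i], np.delete(earthquakes, i, axis=0)))
             for i in range(len(earthquakes))]
    return sum(flags) / len(flags)

def detection_rate(explosions, earthquakes, is_outlier):
    """Treat each explosion in turn as the "new" event tested against the earthquakes."""
    flags = [bool(is_outlier(x, earthquakes)) for x in explosions]
    return sum(flags) / len(flags)

earthquakes = np.linspace(-1.0, 1.0, 21)   # synthetic scalar discriminant values
explosions = np.array([5.0, 6.0])          # synthetic values well outside that range
print(leave_one_out_false_alarm_rate(earthquakes, zscore_detector))
print(detection_rate(explosions, earthquakes, zscore_detector))
```

Passing the detector in as a function keeps the leave-one-out loop independent of the particular test, so the same loop would apply to any outlier rule.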

These  results  show  that  useful  monitoring  can  be  performed  with  the  outlier-detection 
approach,  currently  down  to  magnitude  3,  for  regions  that  are  well-covered  by  at  least  one 
seismic station or array. Between 92% and 100% of the explosions and quarry blasts were
detected  as  outliers  of  the  earthquake  groups  in  the  various  regions,  except  at  KNB  where 
the  detection  rate  was  80%.  There  was  one  false  alarm  at  each  of  the  stations  KNB,  MNV 
and  WMQ.  Overall,  264  of  290  (91%)  explosions  were  detected  and  there  were  only  3  false 
alarms  out  of  158  earthquakes  (1.9%),  slightly  higher  than  the  target  rate  of  1%.  Figure  18 
summarizes  these  results,  which  were  obtained  for  diverse  geological  regions,  seismic 
stations  and  arrays,  and  for  a  wide  range  of  epicentral  distances  and  magnitudes. 
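
The overall figures can be checked directly from the per-station counts quoted in the text. The short tally below uses only those counts; the dictionary keys are labels chosen here for readability.

```python
# (detected, total) explosions and quarry blasts flagged as outliers, as quoted in the text.
detections = {
    "ARCESS Novaya Zemlya": (3, 3),
    "ARCESS Kola": (50, 51),
    "ARCESS Kiruna": (36, 39),
    "GERESS Vogtland": (13, 13),
    "WMQ": (17, 17),
    "MNV": (74, 78),
    "KNB": (71, 89),
}
false_alarms = 3        # one each at KNB, MNV and WMQ
n_earthquakes = 158     # total earthquakes tested

total_detected = sum(d for d, _ in detections.values())
total_events = sum(t for _, t in detections.values())
detection_pct = 100.0 * total_detected / total_events
false_alarm_pct = 100.0 * false_alarms / n_earthquakes

print(f"{total_detected} of {total_events} detected ({detection_pct:.0f}%)")
print(f"{false_alarms} of {n_earthquakes} false alarms ({false_alarm_pct:.1f}%)")
```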

Outlier  Detection  and  False  Alarm  Rates 


Figure  18.  Summary  of  outlier  test  results  (detection  and  false  alarm  rates)  at  ARCESS,  GERESS, 
WMQ, MNV and KNB. Overall results are shown on the far right.

Results of the outlier tests for the various stations are illustrated in Figures 19-22. Figure
19  shows  outlier  test  results  for  the  events  observed  at  WMQ.  The  distribution  shown  is  of 
the  likelihood  ratio  when  the  event  being  tested  is  from  the  same  group  as  the  earthquake 
training  set,  i.e.,  when  the  null  hypothesis  is  true.  The  vertical  line  represents  the  threshold 
of  the  test  for  the  significance  level  listed  in  the  lowest  legend.  Events  whose  likelihood 
ratios  are  less  than  the  threshold  are  identified  as  outliers  at  the  corresponding  significance 
level.  The  triangles  and  circles  depict  values  of  the  likelihood  ratio  for  the  explosions  and 
earthquakes  being  individually  tested,  respectively.  The  middle  legend  lists  the 
discriminants  used.  (Recall  that  Pn/Sn  was  not  available  for  most  of  the  WMQ  explosions.) 


[Figure 19 plot. Title: Outlier Test Results: WMQ. Axis label: Log(MLR).]


Figure  19.  Graphical  representation  of  outlier  test  results  for  the  WMQ  events.  The  triangles  and 
circles  show  values  of  the  likelihood  ratio  for  the  explosions  and  earthquakes  being  tested,  respectively. 
Events whose likelihood ratios are less than the threshold (vertical line) are identified as outliers.

Figure  19  shows  that  all  17  nuclear  explosions  are  flagged  as  outliers  of  the  earthquake 
group  at  0.01  significance  level.  Figure  19  also  illustrates  how  the  likelihood  ratio 
combines  multivariate  data  (e.g.,  values  of  five  different  discriminants  in  this  case)  into  a 
single  variable  which  allows  events  to  be  ranked  in  a  statistically  consistent  manner,  rather 
than  providing  a  “yes/no”  decision.  Events  with  the  smallest  likelihood  ratios  (out  in  the 
tail) would be considered the most suspicious and could be investigated first.

[Figure 20 panel titles: Outlier Test Results: ARAO Novaya Zemlya; Outlier Test Results: ARAO Kola Peninsula]

Figure 20. Graphical representations of outlier test results for Novaya Zemlya nuclear explosions (upper left), Kola quarry blasts (upper right), Kiruna
quarry blasts (lower left), and Vogtland quarry blasts (lower right). The triangles depict values of the likelihood ratio for the explosions while the circles
correspond to values of the likelihood ratio for the earthquakes considered at each array.


Figure  20  shows  similar  results  of  outlier  tests  performed  on  events  recorded  by  ARCESS 
and GERESS. The upper left plot shows that all three of the Novaya Zemlya nuclear
explosions  recorded  by  ARCESS  were  detected  as  outliers  at  0.01  significance  level.  The 
Steigen  and  Spitsbergen  earthquakes  were  used  as  training  data  in  this  case.  Likewise,  the 
upper  right  plot  shows  results  for  the  Kola  quarry  blasts  seen  at  ARCESS.  The  five 
Spitsbergen  earthquakes  were  not  used  in  this  case  to  make  use  of  Pn/Lg  which  was 
measured  for  the  Kola  and  Steigen  events.  The  lower  left  plot  shows  that  36  of  the  39 
Kiruna  quarry  blasts  recorded  by  ARCESS  were  detected  as  outliers.  The  lower  right  plot 
shows that all 13 of the Vogtland quarry blasts recorded by GERESS were identified as
outliers.  There  were  no  false  alarms  for  any  of  these  cases. 

Figure  21  shows  results  of  outlier  tests  performed  on  events  recorded  by  MNV.  In  this  case, 
74  of  78  nuclear  explosions  were  detected  as  outliers  at  0.01  significance  level  and  there 
was  1  false  alarm  out  of  37  earthquakes.  Last,  Figure  22  shows  results  of  outlier  tests 
performed  on  events  recorded  by  KNB.  In  this  case,  71  of  89  nuclear  explosions  were 
detected  as  outliers  and  there  was  1  false  alarm  out  of  59  earthquakes. 


[Figure 21 plot. Title: Outlier Test Results: MNV]

Figure 21. Outlier test results for MNV events with both discriminant values present. The triangles and
circles show values of the likelihood ratio for the explosions and earthquakes being tested, respectively.

[Figure 22 plot. Title: Outlier Test Results: KNB]

Figure 22. Same as Figure 21 but for events recorded by KNB with both discriminant values present.

4.2.  Contaminated  Training  Set  Study 

We  have  assumed  to  this  point  that  the  data  sets  in  each  region  consist  of  earthquakes  and 
at  most  one  explosion.  In  practice,  data  sets  will  also  consist  of  quarry  blasts  and  rock 
bursts,  but  we  may  not  know  this  a  priori.  Thus,  we  should  be  concerned  with  the  situation 
in  which  quarry  blasts  and  rock  bursts  may  contaminate  earthquake  training  sets  when 
trying  to  detect  a  nuclear  explosion.  Although  there  are  likely  to  be  only  a  small  number  of 
quarry  blasts  above  magnitude  3,  it  is  possible  that  a  couple  of  large  quarry  blasts  might 
contaminate  the  earthquake  training  set.  Large  (mb  >  3)  rock  bursts  or  mine  tremors  could 
also  occur  and  be  present  in  data  sets.  The  purpose  of  this  study  is  to  determine  how  a  lack 
of  ground-truth  affects  our  ability  to  monitor  using  the  outlier  test. 

To  address  this  problem,  we  intentionally  included  quarry  blasts  or  rock  bursts  in 
earthquake  training  sets  to  determine  if  the  outlier  test  can  detect  them  and  to  assess 
potential  impacts  on  identification  performance.  (Note  that  in  Section  4.1,  we  considered 
each quarry blast to be a “new” generic explosion for the sake of estimating a lower bound
on the detection rate of nuclear explosions as outliers for regions in which nuclear
explosion  data  was  lacking  or  minimal.  Here  we  consider  the  quarry  blasts  as  unique  events 
which  can  degrade  our  ability  to  identify  nuclear  explosions.) 

We  first  selected  2  Vogtland  quarry  blasts  and  inserted  them  in  the  Vogtland  earthquake 
set.  To  make  the  problem  interesting,  we  chose  the  2  that  are  most  like  the  earthquakes.  We 
then  ran  the  outlier  test  on  each  event  in  the  training  set  using  the  leave-one-out  procedure 
and  the  same  discriminants  as  above.  Note  that  even  if  one  quarry  blast  is  left  out  of  the 
training  set  to  be  tested,  one  contaminating  event  still  remains.  If  an  outlier  is  detected,  it 
is  removed  from  the  set  and  the  remainder  are  tested  again.  In  this  case,  both  quarry  blasts 
were  flagged  on  the  first  pass.  If  the  2  quarry  blasts  are  not  removed  from  the  earthquake 
training set, 2 of the remaining 11 Vogtland quarry blasts are undetected when tested.

This  analysis  was  repeated  by  randomly  selecting  2  Kola  Peninsula  quarry  blasts  and 
inserting  them  in  the  Steigen  earthquake  set.  Only  one  quarry  blast  was  detected  on  the  first 
pass,  but  once  removed,  the  other  was  also  detected.  If  the  2  Kola  quarry  blasts  are  not 
removed  from  the  Steigen  earthquake  training  set,  7  of  the  remaining  51  quarry  blasts  are 
undetected when tested. However, even with both Kola quarry blasts left in the training set, the
outlier test was still able to detect all 3 Novaya Zemlya nuclear explosions as outliers.
(ARCESS  was  the  only  array  for  which  we  have  data  of  these  three  types.) 
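
The test-remove-retest loop used in these contamination experiments might be sketched as follows. As a simplification, the detector is a z-score rule on one synthetic discriminant rather than the likelihood ratio test, and the data are constructed so that the larger contaminant initially masks the smaller one, similar in spirit to the Kola case above. Function names, the cutoff and the data values are hypothetical.

```python
import numpy as np

def zscore_detector(event, training, cutoff=3.0):
    """Stand-in detector for illustration only (not the likelihood ratio test)."""
    training = np.asarray(training, dtype=float)
    return abs(event - training.mean()) / training.std(ddof=1) > cutoff

def prune_contaminants(events, is_outlier, max_passes=10):
    """Repeat leave-one-out testing, removing flagged events, until a pass flags none."""
    events = np.asarray(events, dtype=float)
    keep = np.ones(len(events), dtype=bool)
    removed_per_pass = []
    for _ in range(max_passes):
        idx = np.flatnonzero(keep)
        flagged = []
        for i in idx:
            rest = events[idx[idx != i]]      # all retained events except the one tested
            if is_outlier(events[i], rest):
                flagged.append(int(i))
        if not flagged:
            break
        keep[flagged] = False
        removed_per_pass.append(flagged)
    return keep, removed_per_pass

# 21 synthetic "earthquake" values plus two contaminating events (indices 21 and 22).
events = np.concatenate([np.linspace(-1.0, 1.0, 21), [4.0, 9.0]])
keep, removed_per_pass = prune_contaminants(events, zscore_detector)
print(removed_per_pass)
```

In this synthetic case the larger contaminant inflates the spread of the training set enough to hide the smaller one on the first pass, which is why a second pass is needed.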

As  a  related  study,  we  contaminated  the  Vogtland  earthquake  set  with  30  Lubin  rock  bursts 
and  then  tested  the  Vogtland  quarry  blasts.  All  13  Vogtland  quarry  blasts  were  still  flagged 
as  outliers.  This  is  primarily  due  to  the  fact  that  Pn/Lg  measurements  for  the  Vogtland 
quarry  blasts  are  considerably  higher  than  those  for  the  Vogtland  earthquakes  and  Lubin 
rock  bursts  (cf.  Figure  10). 

These  results  show  that  the  outlier  test  can  be  used  to  detect  contaminating  events,  in 
proportions  of  10-20%,  in  data  sets  which  lack  ground-truth.  If  they  are  not  detected  and 
contaminated  data  sets  are  used  to  train  the  outlier  test,  the  probability  of  detecting  an 
explosion  may  be  degraded  significantly,  by  15-20%  or  more,  although  contamination  did  not
affect  the  ability  to  identify  the  Novaya  Zemlya  explosions  as  outliers.  Contamination  by  rock  bursts
did  not  reduce  the  capability  to  detect  mining  explosions  as  outliers  in  the  case  examined, 
but  should  be  studied  more  fully. 

4.3.  Applications  with  the  Classification  Test

Since  we  actually  have  well-defined  training  sets  for  more  than  one  event  type  in  the 
various  regions  considered,  we  can  use  the  classification  test  in  these  regions  to  identify 
new  events.  Here  we  perform  a  study  similar  to  that  in  Section  4.1  to  estimate  the  rate  at 
which  explosions  are  identified  correctly  and  the  false  alarm  rate.  Again  we  use  the  leave- 
one-out  procedure,  in  which  each  event  being  tested  is  sequentially  removed  from  the  data 
sets  used  to  train  the  classification  test.  The  objectives  here  are  to  illustrate  how  regions  can 
be  monitored  with  the  classification  test  if  multiple  training  sets  are  available  and  to  assess 
the  improvement  in  monitoring  performance  this  test  offers  over  the  outlier  test.  The 
significance  level  was  again  set  at  0.01.  Here  we  summarize  the  results  of  this  study. 
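
The leave-one-out procedure can be sketched as below. This is an illustrative stand-in under stated assumptions: a single discriminant, a normal model for each group, and a decision by the sign of the log-likelihood ratio rather than by a threshold at a chosen significance level. The function names and the synthetic values are hypothetical.

```python
from math import log, pi
from statistics import mean, variance

def log_pdf(x, sample):
    """Log density of x under a normal distribution fitted to `sample`."""
    m, v = mean(sample), variance(sample)
    return -0.5 * (log(2.0 * pi * v) + (x - m) ** 2 / v)

def log_likelihood_ratio(x, explosions, earthquakes):
    """Positive values favor the explosion group (a convention chosen here)."""
    return log_pdf(x, explosions) - log_pdf(x, earthquakes)

def leave_one_out(explosions, earthquakes):
    """Count events assigned to their own group when held out of training."""
    hits_ex = sum(
        log_likelihood_ratio(x, explosions[:i] + explosions[i + 1:], earthquakes) > 0
        for i, x in enumerate(explosions)
    )
    hits_eq = sum(
        log_likelihood_ratio(x, explosions, earthquakes[:i] + earthquakes[i + 1:]) < 0
        for i, x in enumerate(earthquakes)
    )
    return hits_ex, hits_eq

# Synthetic, well-separated values of one log-ratio discriminant.
explosions = [0.31, 0.44, 0.27, 0.38, 0.52, 0.35, 0.41, 0.29]
earthquakes = [-0.42, -0.35, -0.51, -0.38, -0.47, -0.40, -0.33, -0.45]
hits_ex, hits_eq = leave_one_out(explosions, earthquakes)
print(f"explosions correct: {hits_ex}/{len(explosions)}")
print(f"earthquakes correct: {hits_eq}/{len(earthquakes)}")
```

Holding each event out of its own training set keeps the estimated identification and false alarm rates from being biased by an event that helped define the group it is tested against.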

First,  the  classification  results  for  ARCESS  and  GERESS  are  equivalent  to  those  using  the 
outlier  test.  For  ARCESS,  using  the  Novaya  Zemlya  explosions  and  Steigen/Spitsbergen 
earthquakes  to  train  the  test,  3  of  3  Novaya  Zemlya  explosions  and  all  29  earthquakes  were 
correctly  identified  using  the  distance-corrected  discriminants  listed  in  Section  4.1.  Similar 
tests  correctly  identified  all  24  Steigen  earthquakes,  50  of  51  (98%)  Kola  quarry  blasts,  and 
36  of  39  (92%)  Kiruna  quarry  blasts  once  distance  corrections  were  applied.  For  GERESS, 
using  the  Vogtland  earthquakes  and  quarry  blasts  to  train  the  classification  test,  10  of  10 
earthquakes  and  13  of  13  quarry  blasts  were  identified  correctly.  Thus,  as  before,  there  were 
no  false  alarms  at  ARCESS  or  GERESS,  while  96%  and  100%  of  the  explosions  were 
identified  correctly  at  ARCESS  and  GERESS,  respectively.  Four  of  the  90  quarry  blasts  are 
still  found  to  be  more  consistent  with  the  earthquakes  than  with  the  other  quarry  blasts. 

Second,  the  classification  test  correctly  identified  all  WMQ  events:  23  of  23  earthquakes  and
17  of  17  Balapan  and  Lop  Nor  nuclear  explosions  were  identified  correctly.  This  is  an 
improvement  over  the  outlier  test  for  which  there  was  one  false  alarm  at  WMQ. 

Third,  the  classification  test  provides  the  greatest  improvement  over  the  outlier  test  for  the
LNN  stations.  For  MNV,  78  of  78  NTS  nuclear  explosions  and  37  of  37  earthquakes  were 
identified  correctly.  Recall  that  the  outlier  test  misidentified  4  of  the  NTS  explosions  and  1 
of  the  earthquakes  recorded  by  MNV.  For  KNB,  75  of  89  (84%)  NTS  nuclear  explosions 
and  58  of  59  (98%)  earthquakes  were  identified  correctly.  Note  that  there  was  also  one  false 
alarm  at  KNB  using  the  outlier  test  and  4  additional  NTS  explosions  were  misidentified. 

Figure  23  illustrates  classification  results  for  the  events  recorded  by  MNV.  The  distribution 
on  the  right  is  of  the  likelihood  ratio  which  was  bootstrapped  from  the  earthquake  and 
explosion  training  data,  and  the  new  event  was  bootstrapped  from  the  earthquake  set.  The
distribution  on  the  left  is  similar  except  that  the  new  event  was  bootstrapped  from  the 
explosion  set.  The  vertical  lines  correspond  to  thresholds  of  the  test  for  various  significance 
levels  listed  in  the  lowest  legend.  The  thick  vertical  line  corresponds  to  the  classification 
rule  which  minimizes  the  total  error  rate.  The  triangles  correspond  to  values  of  the 
likelihood  ratio  for  the  explosions  being  tested,  while  the  circles  correspond  to  similar 
values  for  the  earthquakes.  This  plot  shows  that  all  events  are  correctly  identified  at  0.01 
significance  level  and  for  the  classification  rule  which  minimizes  the  total  error  rate. 

Figure  23.  Graphical  representation  of  classification  results  for  78  NTS  nuclear  explosions  (triangles)
and  37  earthquakes  (circles)  recorded  by  MNV.  All  events  were  classified  correctly.
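
A rough sketch of how two bootstrapped likelihood-ratio distributions and a threshold of this kind could be generated is given below. It rests on several assumptions that this section does not confirm: a univariate normal model for each group, a ratio defined so that positive values favor explosions (the orientation in the figure may differ), and a threshold taken as the quantile exceeded by roughly a fraction alpha of the earthquake-like bootstrap values. All names and data are hypothetical.

```python
import random
from math import log, pi
from statistics import mean, median, variance

def log_lr(x, explosions, earthquakes):
    """Gaussian log-likelihood ratio; positive values favor the explosion group."""
    def log_pdf(sample):
        m = mean(sample)
        v = max(variance(sample), 1e-6)  # floor guards against degenerate resamples
        return -0.5 * (log(2.0 * pi * v) + (x - m) ** 2 / v)
    return log_pdf(explosions) - log_pdf(earthquakes)

def bootstrap_lr(pool, explosions, earthquakes, n_boot, rng):
    """Sorted ratio values for new events drawn from `pool`, with resampled training sets."""
    out = []
    for _ in range(n_boot):
        ex = rng.choices(explosions, k=len(explosions))
        eq = rng.choices(earthquakes, k=len(earthquakes))
        out.append(log_lr(rng.choice(pool), ex, eq))
    return sorted(out)

# Synthetic values of one log-ratio discriminant.
explosions = [0.31, 0.44, 0.27, 0.38, 0.52, 0.35, 0.41, 0.29]
earthquakes = [-0.42, -0.35, -0.51, -0.38, -0.47, -0.40, -0.33, -0.45]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
alpha = 0.01
as_earthquake = bootstrap_lr(earthquakes, explosions, earthquakes, 2000, rng)
as_explosion = bootstrap_lr(explosions, explosions, earthquakes, 2000, rng)
# Threshold exceeded by roughly a fraction alpha of the earthquake-like values.
threshold = as_earthquake[int((1.0 - alpha) * len(as_earthquake)) - 1]
print("median, new event from earthquakes:", median(as_earthquake))
print("median, new event from explosions: ", median(as_explosion))
print("threshold at alpha = 0.01:         ", threshold)
```

Resampling the training sets as well as the new event lets the spread of each distribution reflect uncertainty in the fitted group parameters, not only the scatter of individual events.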

Figure  24  illustrates  similar  classification  results  for  the  events  recorded  by  KNB.  This  plot 
shows  that  75  of  89  (84%)  NTS  explosions  and  58  of  59  earthquakes  are  classified  correctly 
using  the  threshold  at  0.01  significance  level.  Plots  of  the  classification  results  at  station 
WMQ  and  the  ARCESS  and  GERESS  arrays  are  similar. 

The  classification  results  illustrate  the  improved  identification  performance  of  the 
classification  test  over  the  outlier  test  for  regions  in  which  multiple  training  sets  exist. 
Between  96%  and  100%  of  the  explosions  and  quarry  blasts  in  the  various  regions  were
identified  correctly,  except  at  KNB  where  the  identification  rate  was  84%.  Also,  there  was 
only  one  false  alarm  at  KNB  using  the  classification  test.  Overall,  272  of  290  (94%) 
explosions  were  identified  correctly  and  there  was  only  1  false  alarm  out  of  158
earthquakes  (0.6%),  less  than  the  target  rate  of  1%.  Figure  25  summarizes  these  results. 
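
The overall figures follow directly from the per-station counts quoted in this section. The short tally below reproduces that arithmetic; the grouping of the Novaya Zemlya, Kola and Kiruna events under ARCESS follows the text, and the variable names are chosen only for this illustration.

```python
# (station, explosions correct, explosions tested, false alarms, earthquakes tested)
results = [
    ("ARCESS", 3 + 50 + 36, 3 + 51 + 39, 0, 29),
    ("GERESS", 13, 13, 0, 10),
    ("WMQ", 17, 17, 0, 23),
    ("MNV", 78, 78, 0, 37),
    ("KNB", 75, 89, 1, 59),
]

for name, ok, n_ex, fa, n_eq in results:
    print(f"{name:7s} {ok:3d}/{n_ex:<3d} ({100 * ok / n_ex:5.1f}%)  false alarms {fa}/{n_eq}")

total_ok = sum(r[1] for r in results)
total_ex = sum(r[2] for r in results)
total_fa = sum(r[3] for r in results)
total_eq = sum(r[4] for r in results)
print(f"Overall {total_ok}/{total_ex} ({100 * total_ok / total_ex:.0f}%), "
      f"false alarms {total_fa}/{total_eq} ({100 * total_fa / total_eq:.1f}%)")
```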


Figure  24.  Graphical  representation  of  classification  results  for  89  NTS  nuclear  explosions  (triangles)
and  59  earthquakes  (circles)  recorded  by  KNB.  In  this  case,  75  of  89  explosions  and  58  of  59
earthquakes  were  classified  correctly  at  0.01  significance  level.


Figure  25.  Summary  of  classification  results  at  ARCESS,  GERESS,  WMQ,  MNV  and  KNB.  Results
are  expressed  in  terms  of  the  percentage  of  explosions  correctly  classified  and  the  false  alarm  rate  at
each  station.  Overall  results  are  shown  on  the  far  right.

4.4.  Identification  Results  for  the  921231  Novaya  Zemlya  Event

Here  we  examine  the  identification  of  the  31  December  1992  (921231)  event  on  Novaya 
Zemlya  as  a  special  case.  Based  on  origin  analysis  by  the  Intelligent  Monitoring  System 
(e.g.,  Bache  et  al.,  1990),  a  small  regional  event  (ML  2.26)  occurred  at  origin  time  12/31/92
09:29:24  (GMT),  latitude  73.58  and  longitude  55.21  on  Novaya  Zemlya.  (This  event  is
also  referred  to  by  its  origin  identification  number  361575.)  Considerable  interest  in  this 
event  is  motivated  by  its  relevance  to  future  CTBT  or  NPT  monitoring  scenarios  since  it 
occurred  in  an  area  of  very  low  seismicity,  had  magnitude  corresponding  to  a  fully 
decoupled  1  kt  nuclear  test,  and  was  detected  by  only  a  few  arrays  at  regional  distances 
(Ryall,  1993).  Of  further  interest  is  the  fact  that  it  occurred  in  a  region  where  previous 
underground  nuclear  tests  had  been  conducted. 

Several  ARPA  contractors  previously  analyzed  the  identification  of  the  921231  event  and 
their  results  are  summarized  by  Ryall  (1993).  They  generally  agreed  that  this  event  was  not 
a  nuclear  detonation.  Results  obtained  by  Baumgardt  (1993b),  Fisk  and  Gray  (1993),  and 
Pulli  and  Dysart  (1993)  suggest  that  this  event  was  much  more  like  Kola  Peninsula  mining 
blasts  than  earthquakes  in  Scandinavia  and  near  Spitsbergen,  in  apparent  contradiction  to  a 
statement  by  the  Seismological  Service  of  the  Ministry  of  Defense,  Russian  Federation, 
that  no  blasting  had  occurred  on  the  Novaya  Zemlya  test  range  on  the  day  in  question 
(Ryall,  1993). 

Discriminants  used  to  obtain  the  previous  results  were  based  on  Pn  and  Sn  measurements 
of  ARCESS  recordings.  No  corrections  were  made  for  attenuation,  although  all  authors
recognized  their  importance  to  the  outcome  of  identification  tests,  since  the  epicentral
distances  of  the  921231  event  and  reference  events  ranged  from  300  to  1300  km  and  it  is
well  known  that  Pn/Sn  is  affected  by  distance.  More  rapid  attenuation  of  Sn  with  distance
relative  to  Pn  would  cause  the  921231  event  to  have  higher  Pn/Sn  ratios  than  a  similar  event 
occurring  in  the  Steigen  or  Kola  Peninsula  regions  due  to  the  longer  path  from  Novaya 
Zemlya.  It  was  speculated  that  applying  distance  corrections  may  lead  to  Pn/Sn  ratios  that 
are  more  consistent  with  earthquakes  than  with  quarry  blasts. 
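
The direction of this effect can be illustrated with a simple power-law decay model, in which each phase amplitude falls off as distance raised to a negative exponent. This is only a sketch under that assumption: the exponents and the reference distance below are hypothetical placeholders, not the corrections reported by Sereno (1990).

```python
from math import log10

def distance_corrected_log_ratio(log_ratio, distance_km, n_pn, n_sn, ref_km=500.0):
    """Shift an observed log10(Pn/Sn) to the value expected at ref_km.

    Assumes Pn ~ r**(-n_pn) and Sn ~ r**(-n_sn), so the observed log ratio
    grows by (n_sn - n_pn) * log10(r / ref_km) relative to the reference distance.
    """
    return log_ratio - (n_sn - n_pn) * log10(distance_km / ref_km)

# Hypothetical decay exponents, chosen only so that Sn decays faster than Pn.
N_PN, N_SN = 1.5, 2.0

observed = 0.20  # the same observed log10(Pn/Sn) at two different distances
near = distance_corrected_log_ratio(observed, 300.0, N_PN, N_SN)
far = distance_corrected_log_ratio(observed, 1300.0, N_PN, N_SN)
print(f"corrected value at  300 km: {near:+.3f}")
print(f"corrected value at 1300 km: {far:+.3f}")
```

Under these assumptions the correction lowers the ratio for the distant event and raises it for the nearby one, which is the mechanism by which a Novaya Zemlya event could move toward the earthquake population once distance is accounted for.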

The  objectives  of  this  study  are  to  resolve  the  apparent  discrepancy  and  to  assess  the  impact 
of  attenuation  on  regional  event  identification.  Using  distance  corrections  reported  by 
Sereno  (1990),  we  reanalyzed  the  identification  of  the  921231  event.  As  in  the  study  by 
Fisk  and  Gray  (1993),  the  feature  data  used  here  were  obtained  from  seismic  analysis 
performed  by  Baumgardt  (1993b)  using  the  Intelligent  Seismic  Event  Identification  System 
(ISEIS)  (e.g.,  Baumgardt  et  al.,  1991).  Discriminants  consist  of  Pn/Sn  ratios  of  maximum
amplitude  measurements  in  five  frequency  bands,  4-6,  5-7,  6-8,  8-10,  and  8-16  Hz,  recorded
by  ARA0.  For  comparison,  we  again  used  3  nuclear  explosions  at  the  Novaya  Zemlya  test
site,  the  Kola  and  Steigen  data  sets,  and  5  earthquakes  near  Spitsbergen.  Figure  26  shows 
the  locations  of  these  events  and  the  ARCESS  array.  Figure  9  above  shows  scatter  plots  of
the  Pn/Sn  measurements  for  these  events  before  and  after  applying  distance  corrections. 

Figure  26.  Locations  of  the  921231  event  (ORID=361575)  on  Novaya  Zemlya,  reference  events  used  in
its  identification  analysis,  and  the  ARCESS  array  at  which  all  events  were  recorded. 


After  first  applying  distance  corrections  to  these  discriminants,  we  applied  the  outlier  and
classification  tests  to  the  921231  event.  Since  there  are  training  data  for  three  groups  in  this
region  (i.e.,  for  nuclear  explosions,  earthquakes  and  mining  blasts),  the  classification  test
was  used  to  allocate  the  921231  event  into  one  of  them.  The  outlier  test  was  also  used  to
illustrate  how  it  would  be  applied  to  identifying  this  event  if  only  one  of  the  three
training  sets  existed.  We  also  examined  the  range  of  significance  levels  for  which  various
hypotheses  are  rejected.  This  type  of  in-depth  analysis  using  multiple  tests  illustrates  how
an  expert  analyst  might  use  these  tools  in  an  interactive  mode  to  examine  suspicious  or 
problematic  events.  The  following  results  and  observations  were  obtained.  (Further  details 
are  provided  by  Fisk,  1993.) 

First,  there  is  sufficient  evidence  to  reject  the  921231  event  as  a  member  of  the  Novaya
Zemlya  nuclear  explosion  group  at  0.01  significance  level,  using  either  the  outlier  or 
classification  tests.  This  is  consistent  with  the  results  of  Fisk  and  Gray  (1993)  as  expected 
since  the  epicentral  distances  for  the  Novaya  Zemlya  events  are  all  very  similar;  thus, 
attenuation  corrections  should  have  little  effect  on  the  identification  relative  to  this  group. 

Second,  based  on  separate  outlier  tests,  there  is  insufficient  evidence  to  reject  the  921231 
event  as  a  member  of  either  the  quarry  blast  or  earthquake  groups  at  0.05  significance  level 
or  below.  Thus,  using  the  monitoring  procedure  described  in  Section  3.2,  this  event  would 
not  be  flagged  as  an  outlier  for  further  analysis  based  solely  on  Pn/Sn  measurements. 
However,  it  would  be  flagged  as  an  outlier  if  we  included  the  seismicity  of  Novaya  Zemlya 
as  a  discrimination  parameter  since  the  seismicity  of  this  region  is  very  low.  Furthermore, 
there  is  no  commercial  mining  activity  on  Novaya  Zemlya  while  there  is  a  nuclear  test  site. 

Third,  classification  between  earthquake  and  quarry  blast  alternatives  led  to  rejection  of 
this  event  as  a  member  of  the  Kola  quarry  blast  group  at  0.05  significance  level,  although 
it  was  also  rejected  as  a  member  of  the  earthquake  group  using  Pn/Sn  in  the  five  frequency 
bands.  Omitting  the  6-8  Hz  band  (which  was  found  to  be  inconsistent  with  the  values  in  the 
other  four  bands  for  the  921231  event),  it  was  rejected  as  a  quarry  blast  at  0.01  significance 
level  and  accepted  as  an  earthquake.  These  results  are  quite  different  from  those  of  Fisk  and
Gray  (1993),  Baumgardt  (1993b)  and  Pulli  and  Dysart  (1993),  who  found  that  the  921231 
event  was  much  more  consistent  with  identification  as  a  mining  blast  than  as  an  earthquake. 
Although  identification  of  this  event  is  still  somewhat  inconclusive,  due  to  its  Pn/Sn  value 
in  the  6-8  Hz  band,  it  is  no  longer  inconsistent  with  the  statement  by  the  Seismological 
Service  of  the  Ministry  of  Defense,  Russian  Federation.  This  study  also  demonstrates  that 
distance  corrections  can  have  a  significant  impact  on  event  identification  and  should  be 
applied  routinely. 

Among  other  important  monitoring  issues,  we  found  that  distance-corrected  Pn/Sn  values 
for  the  3  Novaya  Zemlya  nuclear  tests  fall  well  within  the  values  for  the  Kola  quarry  blast 
group  (cf.  Figure  9).  Even  uncorrected  values  are  equivalent  to  those  of  quarry  blasts,
except  for  the  bands  above  8  Hz.  This  implies  that  a  more  effective  discriminator  between 
nuclear  explosions  and  mining  blasts  is  needed.  Spectral  and  cepstral  variances,  presence 
of  cepstral  peaks,  and  presence  of  Rg  have  been  considered  to  identify  mining  blasts,  but
none  has  yet  been  shown  to  be  adequate.

We  also  found  that  overlapping  or  adjacent  bands  of  high-frequency  Pn/Sn,  based  on
maximum  amplitude  measurements,  can  provide  different  evidence.  In  this  case,  Pn/Sn  in 
the  6-8  Hz  band  was  anomalously  higher  than  in  the  other  bands  for  the  921231  event 
(Figure  9).  This  may  indicate  that  a  more  robust  type  of  Pn/Sn  measurement  is  needed. 

5.  Conclusions  and  Recommendations 

Results  presented  in  Section  4.1  show  that  useful  monitoring  can  be  performed  with  the 
outlier-detection  approach,  currently  down  to  magnitude  3,  for  regions  that  are  well- 
covered  by  at  least  one  seismic  station  or  array.  Using  Pn/Lg  and  Pn/Sn  in  several  3-8  Hz 
bands,  between  92%  and  100%  of  the  explosions  and  quarry  blasts  were  detected  as  outliers  of
the  earthquake  groups  in  the  various  regions,  except  at  KNB  where  the  detection  rate  was 
80%.  (An  Lg  spectral  ratio  was  also  used  at  KNB  and  MNV  since  we  only  had  Pn/Lg  in  the 
6-8  Hz  band  for  these  events.)  There  were  no  false  alarms  at  ARCESS  or  GERESS,  while 
there  was  one  false  alarm  at  each  of  KNB,  MNV  and  WMQ.  Overall,  264  of  290  (91%) 
explosions  were  detected  and  there  were  only  3  false  alarms  out  of  158  earthquakes  (1.9%), 
slightly  higher  than  the  target  rate  of  1%.  These  results  were  obtained  for  diverse  regions, 
for  a  wide  range  of  epicentral  distances  and  magnitudes,  and  for  single  stations  and  arrays. 

Although  these  preliminary  results  are  encouraging  for  monitoring  down  to  magnitude  3, 
improvements  are  still  needed.  The  relatively  high  detection  rate  of  explosions  as  outliers 
should  provide  a  useful  deterrent  to  potential  clandestine  testing.  From  a  monitoring 
perspective,  however,  it  is  desired  that  all  underground  nuclear  tests  be  identified.  In 
addition,  the  false  alarm  rate  is  relatively  low  and  could  be  set  even  lower.  This  greatly 
alleviates  the  burden  on  human  analysts.  In  practice,  however,  there  will  be  many  more 
earthquakes  to  contend  with  than  the  number  considered  in  this  study,  with  a  corresponding
percentage  identified  as  outliers.  Although  a  false  alarm  by  the  outlier  test  would  not 
necessitate  an  on-site  inspection,  for  example  (these  would  simply  be  included  with  a  set 
of  events  to  be  examined  in  more  detail),  it  is  important  to  reduce  the  false  alarm  rate  further 
while  maintaining  a  high  detection  rate  of  explosions  as  outliers. 
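
The scaling argument above is simple arithmetic: the expected number of flagged earthquakes is the false alarm rate times the number of earthquakes processed. The sketch below applies the rate observed for the outlier test in this study and the 1% target to a purely hypothetical annual earthquake count; that count is an assumption made for illustration, not a figure from this report.

```python
def expected_false_alarms(n_earthquakes, false_alarm_rate):
    """Expected number of earthquakes flagged as outliers for further analysis."""
    return n_earthquakes * false_alarm_rate

N_PER_YEAR = 10_000  # hypothetical annual earthquake count, for illustration only

for label, rate in [("outlier test, observed (3 of 158)", 3 / 158),
                    ("target significance level (1%)", 0.01)]:
    flagged = expected_false_alarms(N_PER_YEAR, rate)
    print(f"{label:36s} {flagged:6.0f} flagged per year")
```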

One  or  more  additional  discriminants  that  work  about  as  well  as  high-frequency  Pn/Sn  and 
Pn/Lg,  but  that  are  relatively  uncorrelated,  or  information  which  allows  us  to  choose  the 
best  discriminants  and  frequency  bands  in  a  given  region,  should  improve  regional 
identification  performance  further.  Although  Lg  spectral  ratios  discriminate  in  some 
regions  (e.g.,  the  western  U.S.  and  Germany),  they  do  not  in  others  (e.g.,  Scandinavia  and 
Eurasia).  Thus,  we  are  hesitant  to  use  them  in  an  operational  setting  until  further  physical
understanding  of  where  and  why  they  discriminate  is  obtained. 

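Contaminated  training  data  did  not  reduce  the  capability  to  identify  the  Novaya  Zemlya  nuclear
explosions  as  outliers  in  the  cases  studied,  but  it  did  lower  the  detection  rate  for  quarry  blasts  and
remains  a  cause  for  concern,  especially  at  low  magnitude.  We  also  found  that  it  is  very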
important  to  distance-correct  discriminants  in  order  to  obtain  accurate  identification 
results.  Examples  were  shown  in  which  the  880929  Lop  Nor  explosion  and  the  921231 
Novaya  Zemlya  event  would  be  misidentified  using  discriminants  uncorrected  for  distance. 
Transporting  regional  discrimination  rules  appears  very  difficult  and  must  be  done  very 
precisely  in  order  to  have  a  reasonable  probability  of  detecting  a  nuclear  explosion  as  an 
outlier  without  having  an  unmanageable  number  of  false  alarms  worldwide.  A  key  benefit 
of  the  outlier  approach  is  that  it  does  not  require  transporting  discrimination  thresholds, 
while  providing  high  outlier-detection  and  low  false  alarm  rates  in  regions  studied  thus  far. 

Results  presented  in  Section  4.3  show  that  the  classification  test  provides  improved 
identification  accuracy  over  the  outlier  test  for  cases  in  which  well-defined  training  sets  are 
available  for  more  than  one  type  of  event.  Between  96%  and  100%  of  the  explosions  and  quarry
blasts  in  the  various  regions  were  identified  correctly,  except  at  KNB  where  the 
identification  rate  of  NTS  nuclear  explosions  was  84%.  Also,  there  was  only  one  false 
alarm  which  occurred  at  KNB  using  the  classification  test.  Overall,  272  of  290  (94%) 
explosions  were  identified  correctly  and  there  was  only  1  false  alarm  out  of  158
earthquakes  (0.6%),  less  than  the  target  rate  of  1%.  These  results  illustrate  how  the 
classification  test  could  be  used  to  monitor  existing  nuclear  test  sites,  for  example,  with 
very  high  identification  accuracy. 

Future  research  should  focus  on  investigating  ways  to  select  the  best  regional  discriminants 
in  the  absence  of  nuclear  explosion  training  data  and  on  investigating  other  discriminants 
such  as  regional  Ms-mb  for  mb  <  4  (e.g.,  Dainty  and  Kushnir,  1994;  Herrin  et  al.,  1994),  a
discriminant  of  earthquakes  and  explosions  based  on  autoregression  analysis  of  Lg  (e.g.,
Herrin  et  al.,  1994),  waveform  complexity  (e.g.,  Blandford,  1993),  etc.  In  addition,  reliable
automated  methods  are  needed  to  distinguish  nuclear  explosions  from  chemical  mining 
blasts,  particularly  to  improve  the  seismic  monitoring  capability  below  magnitude  3.  Last, 
a  distance-correction  data  base  should  be  established. 